Preface



For more than two thousand years some familiarity with mathe- 
matics has been regarded as an indispensable part of the intellec- 
tual equipment of every cultured person. Today the traditional 
place of mathematics in education is in grave danger. 

These opening sentences to the preface of the classical book "What Is Math- 
ematics?" were written by Richard Courant in 1941. It is somewhat soothing to 
learn that the problems that we tend to associate with the current situation were 
equally acute 65 years ago (and, most probably, much earlier as well). This is not to
say that there are no clouds on the horizon, and by this book we hope to make a 
modest contribution to the continuation of the mathematical culture. 

The first mathematical book that one of our mathematical heroes, Vladimir 
Arnold, read at the age of twelve, was "Von Zahlen und Figuren" 1 by Hans Rademacher 
and Otto Toeplitz. In his interview to the "Kvant" magazine, published in 1990, 
Arnold recalls that he worked on the book slowly, a few pages a day. We cannot 
help hoping that our book will play a similar role in the mathematical development 
of some prominent mathematician of the future. 

We hope that this book will be of interest to anyone who likes mathematics, 
from high school students to accomplished researchers. We do not promise an easy 
ride: the majority of results are proved, and it will take a considerable effort from 
the reader to follow the details of the arguments. We hope that, in reward, the 
reader, at least sometimes, will be filled with awe by the harmony of the subject 
(this feeling is what drives most mathematicians in their work!). To quote from
"A Mathematician's Apology" by G. H. Hardy, 

The mathematician's patterns, like the painter's or the poet's, 
must be beautiful; the ideas, like the colors or the words, must 
fit together in a harmonious way. Beauty is the first test: there 
is no permanent place in the world for ugly mathematics. 
For us too, beauty is the first test in the choice of topics for our own research, 
as well as the subject for popular articles and lectures, and consequently, in the 
choice of material for this book. We did not restrict ourselves to any particular 
area (say, number theory or geometry); our emphasis is on the diversity and the
unity of mathematics. If, after reading our book, the reader becomes interested in 
a more systematic exposition of any particular subject, (s)he can easily find good 
sources in the literature. 

About the subtitle: the dictionary definition of the word classic, used in the
title, is "judged over a period of time to be of the highest quality and outstanding
of its kind". We tried to select mathematics satisfying this rigorous criterion. The
reader will find here theorems of Isaac Newton and Leonhard Euler, Augustin Louis
Cauchy and Carl Gustav Jacob Jacobi, Michel Chasles and Pafnuty Chebyshev,
Max Dehn and James Alexander, and many other great mathematicians of the past.
Quite often we reach recent results of prominent contemporary mathematicians,
such as Robert Connelly, John Conway and Vladimir Arnold.

1 "The Enjoyment of Mathematics", in the English translation; the Russian title was a literal
translation of the German original.

There are about four hundred figures in this book. We fully agree with the 
dictum that a picture is worth a thousand words. The figures are mathematically 
precise - so a cubic curve is drawn by a computer as a locus of points satisfying 
an equation of degree three. In particular, the figures illustrate the importance of 
accurate drawing as an experimental tool in geometrical research. Two examples are 
given in Lecture 29: the Money-Coutts theorem, discovered by accurate drawing 
as late as in the 1970s, and a very recent theorem by Richard Schwartz on the 
Poncelet grid which he discovered by computer experimentation. Another example 
of using a computer as an experimental tool is given in Lecture 3 (see the discussion
of "privileged exponents"). 

We did not try to make different lectures similar in their length and level 
of difficulty: some are quite long and involved whereas others are considerably 
shorter and lighter. One lecture, "Cusps", stands out: it contains no proofs but 
only numerous examples, richly illustrated by figures; many of these examples are 
rigorously treated in other lectures. The lectures are independent of each other but 
the reader will notice some themes that reappear throughout the book. We do not 
assume much by way of preliminary knowledge: a standard calculus sequence will 
do in most cases, and quite often even calculus is not required (and this relatively 
low threshold does not leave out mathematically inclined high school students). 
We also believe that any reader, no matter how sophisticated, will find surprises in 
almost every lecture. 

There are about 200 exercises in the book, many provided with solutions or an- 
swers. They further develop the topics discussed in the lectures; in many cases, they 
involve more advanced mathematics (then, instead of a solution, we give references 
to the literature). 

This book stems from a good many articles we wrote for the Russian magazine 
"Kvant" over the years 1970-1990 2 and from numerous lectures that we gave over 
the years to various audiences in the Soviet Union and the United States (where we
have lived since 1990). These include advanced high school students - the participants of
the Canada/USA Binational Mathematical Camp in 2001 and 2002, undergraduate
students attending the Mathematics Advanced Study Semesters (MASS) program
at Penn State over the years 2000-2006, high school students - along with their
teachers and parents - attending the Bay Area Mathematical Circle at Berkeley.

The book may be used for an undergraduate Honors Mathematics Seminar 
(there is more than enough material for a full academic year), various topics courses, 
Mathematical Clubs at high school or college, or simply as a "coffee table book" to 
browse through, at one's leisure. 

To support the "coffee table book" claim, this volume is lavishly illustrated by
an accomplished artist, Sergey Ivanov. Sergey was the artist-in-chief of the "Kvant"
magazine in the 1980s, and then continued, in a similar position, in the 1990s, at
its English-language cousin, "Quantum". Since Ivanov is a physicist by education, his
illustrations are not only aesthetically attractive but also reflect the mathematical
content of the material.

2 Available, in Russian, online at http://kvant.mccme.ru/

We started this preface with a quotation; let us finish with another one. Max 
Dehn, whose theorems are mentioned here more than once, thus characterized math- 
ematicians in his 1928 address [22]; we believe, his words apply to the subject of 
this book: 

At times the mathematician has the passion of a poet or a con- 
queror, the rigor of his arguments is that of a responsible states- 
man or, more simply, of a concerned father, and his tolerance 
and resignation are those of an old sage; he is revolutionary and 
conservative, skeptical and yet faithfully optimistic. 

Acknowledgments. This book is dedicated to V. I. Arnold on the occasion of 
his 70th anniversary; his style of mathematical research and exposition has greatly 
influenced the authors over the years. 

For two consecutive years, in 2005 and 2006, we participated in the "Research in 
Pairs" program at the Mathematics Institute at Oberwolfach. We are very grateful 
to this mathematicians' paradise where the administration, the cooks and nature 
conspire to boost one's creativity. Without our sojourns at MFO the completion 
of this project would still remain a distant future. 

The second author is also grateful to Max-Planck-Institut for Mathematics in 
Bonn for its invariable hospitality. 

Many thanks to John Duncan, Sergei Gelfand and Günter Ziegler who read
the manuscript from beginning to end and whose detailed (and almost disjoint!) 
comments and criticism greatly improved the exposition. 

The second author gratefully acknowledges partial NSF support.

Algebra and Arithmetics




LECTURE 1 

Can a Number be Approximately Rational? 



1.1 Prologue. Alice 1 (entering through a door on the left): I can prove that
$\sqrt{2}$ is irrational.

Bob (entering through a door on the right): But it is so simple: take a calculator,
press the button [√], then 2, and you will see the square root of 2 on the screen.
It's obvious that it is irrational:

1.414213562

Alice: Some proof indeed! What if $\sqrt{2}$ is a periodic decimal fraction, but the period
is longer than your screen? If you use your calculator to divide, say, 25 by 17, you
will also get a messy sequence of digits:

1.470588235

But this number is rational! 

Bob: You may be right, but for numbers arising in real life problems my method 
usually gives the correct result. So, I can rely on my calculator in determining 
which numbers are rational, and which are irrational. The probability of mistake 
will be very low. 

Alice: I do not agree with you (leaves through a door on the left). 
Bob: And I do not agree with you (leaves through a door on the right). 



1 See D. Knuth, "Surreal Numbers". 




1.2 Who is right? We asked many people, and everybody says: Alice. If you 
know nine (or ninety, or nine million) decimal digits of a number, you cannot say 
whether it is rational or irrational: there are infinitely many rational and irrational 
numbers with the same beginning of their decimal fraction. 

But still the two numbers displayed in Section 1.1, however similar they might
look, are different in one important way. The second one is very close to the
rational number $\frac{25}{17}$: the difference between 1.470588235 and $\frac{25}{17}$ is approximately
$3 \cdot 10^{-10}$. As for 1.414213562, there are no fractions with a two-digit denominator
this close to it; actually, of such fractions, the closest to 1.414213562 is $\frac{99}{70}$, and
the difference between the two numbers is approximately $7 \cdot 10^{-5}$. The shortest
fraction approximating $\sqrt{2}$ with an error of $3 \cdot 10^{-10}$ is $\frac{47321}{33461}$, much longer than
just $\frac{25}{17}$. What is more important, this difference between the two nine-digit decimal
fractions (not transparent to the naked eye) can be easily detected by a primitive
pocket calculator.
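The orders of magnitude quoted above can be checked with exact rational arithmetic. The following Python sketch (the helper name `best_error` is ours, not the book's) finds, for each of the two nine-digit decimals from Section 1.1, the smallest error achievable by a fraction whose denominator has at most two digits.

```python
from fractions import Fraction

def best_error(x, max_den):
    """Smallest |x - p/q| over all fractions p/q with 1 <= q <= max_den."""
    best = None
    for q in range(1, max_den + 1):
        p = round(x * q)                  # nearest numerator for this denominator
        err = abs(x - Fraction(p, q))
        if best is None or err < best:
            best = err
    return best

bob = Fraction("1.414213562")     # nine digits of the square root of 2
alice = Fraction("1.470588235")   # nine digits of 25/17

print(float(best_error(bob, 99)))
print(float(best_error(alice, 99)))
```

For the second decimal the smallest error is of order $10^{-10}$, while for the first it stays near $7 \cdot 10^{-5}$: this is the difference that a calculator can detect.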

To give some support to Bob in his argument with Alice, you can show your 
friends a simple trick. 

1.3 A trick. You will need a pocket calculator which can add, subtract, multiply
and divide (a key $x^{-1}$ will be helpful). Have somebody give you two nine-digit
decimals, say, between 0.5 and 1, for example,

0.635149023 and 0.728101457. 

One of these numbers has to be obtained as a fraction with its denominator less
than 1000 (known to the audience); the other one should be random. You claim
that you can find out in one minute which of the two numbers is a fraction and, in
another minute, find the fraction itself. You are allowed to use your calculator (the
audience will see what you do with it).

How to do it? We shall explain this in this lecture (see Section 1.13). Informally 
speaking, one of these numbers is "approximately rational" , while the other is not 
- whatever this means. 

1.4 What is a good approximation? Let $\alpha$ be an irrational number. How
can we decide whether a fraction $\frac{p}{q}$ (which we can assume irreducible) is a good
approximation for $\alpha$? The first thing which matters is the error, $\left|\alpha - \frac{p}{q}\right|$; we want it
to be small. But this is not all: a fraction should be convenient, that is, the numbers
p and q should not be too big. It is reasonable to require that the denominator q
is not too big: the size of p depends on $\alpha$ which is not related to the precision of
the approximation. So, we want to minimize two numbers, the error $\left|\alpha - \frac{p}{q}\right|$ and
the denominator q. But the two goals contradict each other: to make the error
smaller we must take bigger denominators, and vice versa. To reconcile the two
contradicting demands, we can combine them into one "indicator of quality" of an
approximation. Let us call an approximation $\frac{p}{q}$ of $\alpha$ good if the product $\left|\alpha - \frac{p}{q}\right| \cdot q$ is
small, say, less than $\frac{1}{100}$ or $\frac{1}{1000000}$. The idea seems reasonable, but the following
theorem sounds discouraging.

Theorem 1.1. For any $\alpha$ and any $\varepsilon > 0$ there exist infinitely many fractions
$\frac{p}{q}$ such that

$$q \left|\alpha - \frac{p}{q}\right| < \varepsilon.$$

In other words, all numbers have arbitrarily good approximations, so we cannot 
distinguish numbers by the quality of their rational approximations. 
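Theorem 1.1 can be observed numerically. The sketch below lists the denominators q up to a bound for which the nearest numerator p gives $q\left|\alpha - \frac{p}{q}\right| < \varepsilon$. It assumes that floating-point precision is adequate for the modest denominators involved, and the function name is illustrative only.

```python
import math

def good_denominators(alpha, eps, q_max):
    """Denominators q <= q_max for which some p satisfies q*|alpha - p/q| < eps."""
    found = []
    for q in range(1, q_max + 1):
        p = round(q * alpha)              # the only candidate numerator when eps < 1/2
        if abs(q * alpha - p) < eps:      # q*|alpha - p/q| equals |q*alpha - p|
            found.append(q)
    return found

print(good_denominators(math.sqrt(2), 0.01, 1000))
```

Raising `q_max` keeps producing new denominators, in agreement with the theorem: the list never stops growing, whatever the value of `eps`.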

Our proof of Theorem 1.1 is geometric, and the main geometric ingredient of 
this proof is a "lattice" . Since lattices will be useful also in subsequent sections, we 
shall discuss their relevant properties in a separate section. 

1.5 Lattices. Let O be a point in the plane (the "origin"), and let $v = \overrightarrow{OA}$
and $w = \overrightarrow{OB}$ be two non-collinear vectors (which means that the points O, A, B do
not lie on one line). Consider the set of all points (endpoints of the vectors) $pv + qw$
with integers p and q (Figure 1.1). This is a lattice (generated by v and w). We need the
following two propositions (of which only the first is needed for the proof of Theorem 1.1).







Figure 1.1. The lattice generated by v and w 

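Let $\Lambda$ be a lattice in the plane generated by the vectors v and w.

Proposition 1.1. Let KLMN be a parallelogram such that the vertices K, L,
and M belong to $\Lambda$. Then N also belongs to $\Lambda$.

Proof. Let $\overrightarrow{OK} = av + bw$, $\overrightarrow{OL} = cv + dw$, $\overrightarrow{OM} = ev + fw$. Then

$$\overrightarrow{ON} = \overrightarrow{OK} + \overrightarrow{KN} = \overrightarrow{OK} + \overrightarrow{LM} = \overrightarrow{OK} + (\overrightarrow{OM} - \overrightarrow{OL}) = (a - c + e)v + (b - d + f)w,$$

hence $N \in \Lambda$. □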

Denote the area of the "elementary" parallelogram OACB (where OC = OA + 
OB) by s. 



Proposition 1.2. Let KLMN be a parallelogram with vertices in $\Lambda$.

(a) The area of KLMN is $ns$ where n is a positive integer.

(b) If no point of $\Lambda$, other than K, L, M, N, lies inside the parallelogram KLMN
or on its boundary, then the area of KLMN equals s.

(For a more general statement, Pick's formula, see Exercise 1.1.)

Proof of (b). Let $\ell$ be the length of the longer of the two diagonals of KLMN.

Tile the plane by parallelograms parallel to KLMN. For a tile $\pi$, denote by
$K_\pi$ the vertex of $\pi$ corresponding to K under the parallel translation $KLMN \to \pi$.
Then $\pi \leftrightarrow K_\pi$ is a one-to-one correspondence between the tiles and the points of
the lattice $\Lambda$. (Indeed, no point of $\Lambda$ lies inside any tile or inside a side of any tile;
hence, every point of $\Lambda$ is a $K_\pi$ for some $\pi$.) Let $D_R$ be the disk of radius R centered
at O, and let N be the number of points of $\Lambda$ within $D_R$. Denote the points of $\Lambda$
within $D_R$ by $K_1, K_2, \dots, K_N$. Let $K_i = K_{\pi_i}$. The union of all tiles $\pi_i$ ($1 \le i \le N$)
contains $D_{R-\ell}$ and is contained in $D_{R+\ell}$. Thus, if the area of KLMN is S, then

$$\pi (R - \ell)^2 \le NS \le \pi (R + \ell)^2.$$

The same is true (maybe, with a different $\ell$, but we can take the bigger of the
two $\ell$'s) for the parallelogram OACB, which also does not contain any point of $\Lambda$
different from its vertices; thus,

$$\pi (R - \ell)^2 \le Ns \le \pi (R + \ell)^2.$$

Division of the inequalities shows that

$$\frac{(R - \ell)^2}{(R + \ell)^2} \le \frac{S}{s} \le \frac{(R + \ell)^2}{(R - \ell)^2},$$

and, since $\frac{(R - \ell)^2}{(R + \ell)^2}$ for big R is arbitrarily close to 1, that $S = s$.

Proof of (a). First, notice that if a triangle PQR with vertices in $\Lambda$ does not
contain (either inside or on the boundary) points of $\Lambda$ different from P, Q, R, then
its area is $\frac{s}{2}$: this triangle is a half of a parallelogram PQRS which also contains no
points of $\Lambda$ different from its vertices, and $S \in \Lambda$ by Proposition 1.1. Thus, the area
of the parallelogram PQRS is s (by Part (b)) and the area of the triangle PQR is
$\frac{s}{2}$. Now, if our parallelogram KLMN contains q points of $\Lambda$ inside and p points
on the sides (other than K, L, M, N), then p is even (opposite sides contain equal
numbers of points of $\Lambda$) and the parallelogram KLMN can be cut into $2q + p + 2$
triangles with vertices in $\Lambda$ and with no other points inside or on the sides (see
Figure 1.2), and its area is

$$(2q + p + 2)\,\frac{s}{2} = \left(q + \frac{p}{2} + 1\right) s = ns \quad \text{where } n = q + \frac{p}{2} + 1 \in \mathbb{Z}.$$

(Why is the number of triangles $2q + p + 2$? Compute the sum of the angles
of all the triangles which equals, of course, $\pi$ times the number of triangles. Every
point inside the parallelogram contributes $2\pi$ to this sum, every interior point of
a side contributes $\pi$, and the four vertices contribute $2\pi$. Divide by $\pi$ to find the
number of triangles.) □

Figure 1.2. A dissection of a parallelogram into triangles

1.6 Proof of Theorem 1.1. Let $\alpha$, p, and q be as in Theorem 1.1. Consider
the lattice generated by the vectors $v = (-1, 0)$ and $w = (\alpha, 1)$. Then

$$pv + qw = (q\alpha - p,\ q) = \left(q\left(\alpha - \frac{p}{q}\right),\ q\right).$$

We want to prove that for infinitely many (p, q) this point lies within the strip
$-\varepsilon < x < \varepsilon$ shaded in Figure 1.3, left, or, in other words, that the shaded strip
contains (for any $\varepsilon > 0$) infinitely many points of the lattice.

Figure 1.3. Proof of Theorem 1.1

This is obvious if $\varepsilon$ is not very small, say, if $\varepsilon = \frac{1}{2}$. Indeed, for every positive
integer q, the horizontal line y = q contains a sequence of points of the lattice with
distance 1 between consecutive points; precisely one of these points will be inside
the wide strip $|x| < \frac{1}{2}$. Hence the wide strip contains infinitely many points of the
lattice with positive y-coordinates.

Choose a positive integer n such that $\frac{1}{2n} < \varepsilon$ and cut the wide strip into 2n
narrow strips of width $\frac{1}{2n}$. At least one of these narrow strips must contain
infinitely many points with positive y-coordinates; let it be the strip shaded in Figure
1.3, right. Let $A_0, A_1, A_2, \dots$ be the points in the shaded strip, numbered in the
direction of increasing y-coordinate. For every $i > 0$, take the vector equal to $\overrightarrow{A_0 O}$
with the foot point $A_i$; let $B_i$ be the endpoint of this vector. Since $OA_0A_iB_i$ is
a parallelogram and $O, A_0, A_i$ belong to the lattice, $B_i$ also belongs to the lattice.
Furthermore, the x-coordinate of $B_i$ is equal to the difference between the
x-coordinates of $A_i$ and $A_0$ (again, because $OA_0A_iB_i$ is a parallelogram). Thus,
the absolute value of the x-coordinate of $B_i$ is less than $\frac{1}{2n} < \varepsilon$, that is, all the
points $B_i$ lie in the shaded strip of Figure 1.3, left. □
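The pigeonhole step of this proof can be tried out on a computer. The sketch below is a finite variant of the argument, not the book's exact construction: instead of taking a strip with infinitely many points, it stops as soon as two lattice points $(q\alpha - p, q)$ fall into the same narrow strip of width $\frac{1}{2n}$ and returns their difference. Floating-point arithmetic and the function name are our assumptions.

```python
import math

def small_error_pair(alpha, n):
    """Return (p, q) with q > 0 and |q*alpha - p| < 1/(2n), found by pigeonhole."""
    first_in_strip = {}                        # strip index -> (p, q) of first point seen
    q = 0
    while True:
        q += 1
        p = round(q * alpha)                   # the lattice point on the line y = q
        x = q * alpha - p                      # inside the wide strip |x| <= 1/2
        strip = math.floor((x + 0.5) * 2 * n)  # index of the narrow strip containing x
        if strip in first_in_strip:
            p0, q0 = first_in_strip[strip]
            return p - p0, q - q0              # difference of two points of one strip
        first_in_strip[strip] = (p, q)

p, q = small_error_pair(math.sqrt(2), 50)
print(p, q, abs(q * math.sqrt(2) - p))
```

Since there are only 2n narrow strips (plus one boundary index), the loop is guaranteed to stop after about 2n steps.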



1.7 Quadratic approximations. Theorem 1.1, no matter how beautiful its
statement and proof are, sounds rather discouraging. If all numbers have arbitrarily
good approximations, then we have no way to distinguish between numbers which
possess or do not possess good approximations. To do better, we can try to work
with a different indicator of quality which gives more weight to the denominator
q. Let us now say that an approximation $\frac{p}{q}$ of $\alpha$ is good if the product $q^2 \left|\alpha - \frac{p}{q}\right|$ is
small.

The following theorem, proved a century ago, shows that this choice is reasonable.

Theorem 1.2 (A. Hurwitz, E. Borel). (a) For any $\alpha$, there exist infinitely
many fractions $\frac{p}{q}$ such that

$$q^2 \left|\alpha - \frac{p}{q}\right| < \frac{1}{\sqrt{5}}.$$

(b) There exists an irrational number $\alpha$ such that for any $\lambda > \sqrt{5}$ there are
only finitely many fractions $\frac{p}{q}$ such that

$$q^2 \left|\alpha - \frac{p}{q}\right| < \frac{1}{\lambda}.$$

A proof of this result is contained in Section 1.12. It is based on the geometric
construction of Section 1.6 and on properties of so-called continued fractions which
will be discussed in Section 1.8. But before considering continued fractions, we want
to satisfy a natural curiosity of the reader who may want to see the number which
exists according to Part (b). What is this most irrational irrational number, the
number most averse to rational approximation? Surprisingly, this worst number
is the number most loved by generations of artists, sculptors and architects: the
golden ratio $\frac{1 + \sqrt{5}}{2}$.
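The role of the constant $\frac{1}{\sqrt{5}}$ can be seen numerically. The sketch below evaluates the indicator $q^2\left|\alpha - \frac{p}{q}\right|$ for the golden ratio along ratios of consecutive Fibonacci numbers. Choosing these particular fractions is our assumption (they are the standard good approximations of the golden ratio), and the 50-digit precision of Python's `decimal` module is an arbitrary but ample setting.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50
SQRT5 = Decimal(5).sqrt()
PHI = (1 + SQRT5) / 2                     # the golden ratio

def fibonacci_quality(count):
    """Values q^2 * |phi - p/q| for successive Fibonacci ratios p/q."""
    p, q = 1, 1
    values = []
    for _ in range(count):
        values.append(q * q * abs(PHI - Decimal(p) / Decimal(q)))
        p, q = p + q, p                   # next pair of consecutive Fibonacci numbers
    return values

for value in fibonacci_quality(10):
    print(value)
print(1 / SQRT5)
```

The printed values oscillate around $\frac{1}{\sqrt{5}} \approx 0.447$ and approach it, which is consistent with both parts of Theorem 1.2: infinitely many values fall below $\frac{1}{\sqrt{5}}$, but only finitely many fall below any smaller bound $\frac{1}{\lambda}$.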



1.8 Continued fractions. 



2 To be precise, the golden ratio is not unique: any other number, related to it in the sense
of Exercise 1.8, is equally bad.

1.8.1 Definitions and terminology. A finite continued fraction is an expression
of the form

$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{a_n}}}}}$$

where $a_0$ is an integer, $a_1, \dots, a_n$ are positive integers, and $n \ge 0$.

Proposition 1.3. Any rational number has a unique presentation as a finite
continued fraction.

Proof of existence. For an irreducible fraction $\frac{p}{q}$, we shall prove the existence
of a continued fraction presentation using induction on q. For integers (q = 1), the
existence is obvious. Assume that a continued fraction presentation exists for all
fractions with denominators less than q. Let $r = \frac{p}{q}$, $a_0 = [r]$. Then $r = a_0 + \frac{p'}{q}$ with
$0 < p' < q$, and $r = a_0 + \frac{1}{r'}$ where $r' = \frac{q}{p'}$. Since $p' < q$, there exists a continued
fraction presentation

$$r' = a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{a_n}}}}$$

and, since $r' > 1$, $a_1 = [r'] \ge 1$. Thus,

$$r = a_0 + \frac{1}{r'} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{a_n}}}}}.$$

Proof of uniqueness. If

$$r = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{a_n}}}}},$$

then

$$a_0 = [r], \quad a_1 = \left[\frac{1}{r - a_0}\right], \quad a_2 = \left[\frac{1}{\dfrac{1}{r - a_0} - a_1}\right], \ \dots,$$

which shows that $a_0, a_1, a_2, \dots$ are uniquely determined by r. □

The last line of formulas provides an algorithm for computing $a_0, a_1, a_2, \dots$ for a
given r. Moreover, this algorithm can be applied to an irrational number, $\alpha$, in place
of r, in which case it provides an infinite sequence of integers, $a_0; a_1, a_2, \dots$, with $a_i > 0$
for $i > 0$. We write

$$\alpha = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}.$$

The numbers $a_0, a_1, a_2, \dots$ are called incomplete quotients for $\alpha$. The number

$$r_n = a_0 + \cfrac{1}{a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{a_n}}}}$$

is called the n-th convergent of $\alpha$. Obviously,

$$r_0 < r_2 < r_4 < \dots < \alpha < \dots < r_5 < r_3 < r_1.$$
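The algorithm described above is easy to run for a rational number. The following sketch uses exact arithmetic from Python's `fractions` module; the function names are ours. The first function extracts the incomplete quotients by repeatedly taking the integer part and inverting the remainder, the second evaluates each convergent directly from its nested-fraction definition.

```python
from fractions import Fraction
import math

def incomplete_quotients(r):
    """Incomplete quotients [a0; a1, ..., an] of a rational number r."""
    r = Fraction(r)
    quotients = []
    while True:
        a = math.floor(r)          # integer part [r]
        quotients.append(a)
        if r == a:                 # nothing left to invert: the fraction is finite
            return quotients
        r = 1 / (r - a)            # pass to the next "tail" of the continued fraction

def convergents(quotients):
    """Convergents r_0, r_1, ..., computed from the nested fractions."""
    result = []
    for n in range(len(quotients)):
        value = Fraction(quotients[n])
        for a in reversed(quotients[:n]):
            value = a + 1 / value  # wrap one more level of the nested fraction
        result.append(value)
    return result

print(incomplete_quotients(Fraction(25, 17)))   # [1, 2, 8]
print(convergents([1, 2, 8]))
```

For $\frac{25}{17}$ the convergents are $1, \frac{3}{2}, \frac{25}{17}$: those with even indices approach the number from below, and the one with odd index lies above it.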

The standard procedure for reducing multi-stage fractions yields values for the
numerator and the denominator of $r_n$:

$$r_0 = \frac{a_0}{1}, \quad r_1 = \frac{a_0 a_1 + 1}{a_1}, \quad r_2 = \frac{a_0 a_1 a_2 + a_0 + a_2}{a_1 a_2 + 1}, \ \dots$$

or $r_n = \frac{p_n}{q_n}$ where

$$p_0 = a_0, \quad p_1 = a_0 a_1 + 1, \quad p_2 = a_0 a_1 a_2 + a_0 + a_2, \ \dots$$
$$q_0 = 1, \quad q_1 = a_1, \quad q_2 = a_1 a_2 + 1, \ \dots$$

From now on we shall use a short notation for continued fractions: an infinite
continued fraction with the incomplete quotients $a_0, a_1, a_2, \dots$ will be
denoted by $[a_0; a_1, a_2, \dots]$; a finite continued fraction with the incomplete quotients
$a_0, a_1, \dots, a_n$ will be denoted by $[a_0; a_1, \dots, a_n]$.

1.8.2 Several simple relations.

Proposition 1.4. Let $a_0, a_1, \dots, p_0, p_1, \dots, q_0, q_1, \dots$ be as above. Then

(a) $p_n = a_n p_{n-1} + p_{n-2}$ ($n \ge 2$);

(b) $q_n = a_n q_{n-1} + q_{n-2}$ ($n \ge 2$);

(c) $p_{n-1} q_n - p_n q_{n-1} = (-1)^n$ ($n \ge 1$).

Proof of (a) and (b). We shall prove these results in a more general form, when
$a_0, a_1, a_2, \dots$ are arbitrary real numbers (not necessarily integers). For n = 2, we
already have the necessary relations. Let $n > 2$ and assume that

$$p_{n-1} = a_{n-1} p_{n-2} + p_{n-3}, \qquad q_{n-1} = a_{n-1} q_{n-2} + q_{n-3}$$

for any $a_0, \dots, a_{n-1}$. Apply these formulas to $a'_0 = a_0, \dots, a'_{n-2} = a_{n-2}$, $a'_{n-1} = a_{n-1} + \frac{1}{a_n}$.
Obviously, $p'_i = p_i$, $q'_i = q_i$ for $i \le n - 2$, and $p'_{n-1} = \frac{p_n}{a_n}$, $q'_{n-1} = \frac{q_n}{a_n}$.
Thus,

$$p_n = a_n p'_{n-1} = a_n \left(a'_{n-1} p_{n-2} + p_{n-3}\right) = a_n \left(\left(a_{n-1} + \frac{1}{a_n}\right) p_{n-2} + p_{n-3}\right)$$
$$= a_n \left(a_{n-1} p_{n-2} + p_{n-3}\right) + p_{n-2} = a_n p_{n-1} + p_{n-2},$$

and similarly $q_n = a_n q_{n-1} + q_{n-2}$.

Proof of (c). Induction on n. For n = 1:

$$p_0 q_1 - p_1 q_0 = a_0 a_1 - (a_0 a_1 + 1) \cdot 1 = -1.$$

If $n \ge 2$ and the equality holds for $n - 1$ in place of n, then

$$p_{n-1} q_n - p_n q_{n-1} = p_{n-1} (a_n q_{n-1} + q_{n-2}) - (a_n p_{n-1} + p_{n-2}) q_{n-1}$$
$$= p_{n-1} q_{n-2} - p_{n-2} q_{n-1} = -(p_{n-2} q_{n-1} - p_{n-1} q_{n-2}) = -(-1)^{n-1} = (-1)^n.$$

□
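The recurrences (a) and (b) give a convenient way to compute convergents, and relation (c) is easy to test. In the sketch below the seed values $p_{-1} = 1$, $q_{-1} = 0$ are a common convention that we assume for convenience (they reproduce $p_1 = a_0 a_1 + 1$ and $q_1 = a_1$); they do not appear in the text above, and the function name is ours.

```python
def convergent_pairs(quotients):
    """Pairs (p_n, q_n) from p_n = a_n p_{n-1} + p_{n-2}, q_n = a_n q_{n-1} + q_{n-2}."""
    p_prev, q_prev = 1, 0            # assumed seed values p_{-1}, q_{-1}
    p, q = quotients[0], 1           # p_0 = a_0, q_0 = 1
    pairs = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        pairs.append((p, q))
    return pairs

pairs = convergent_pairs([1, 2, 2, 2, 2, 2])
print(pairs)   # [(1, 1), (3, 2), (7, 5), (17, 12), (41, 29), (99, 70)]

# Relation (c): p_{n-1} q_n - p_n q_{n-1} = (-1)^n for n >= 1.
for n in range(1, len(pairs)):
    (p0, q0), (p1, q1) = pairs[n - 1], pairs[n]
    print(n, p0 * q1 - p1 * q0)
```

The example uses the first quotients of $[1; 2, 2, 2, \dots]$, so the last pair is the fraction $\frac{99}{70}$ from Section 1.2.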

Corollary 1.3. $\lim_{n \to \infty} r_n = \alpha$.

Proof. Indeed,

$$r_n - r_{n-1} = \frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{p_n q_{n-1} - q_n p_{n-1}}{q_n q_{n-1}} = \frac{(-1)^{n-1}}{q_n q_{n-1}}.$$

Since $\alpha$ lies between $r_{n-1}$ and $r_n$, $|r_n - \alpha| < \frac{1}{q_n q_{n-1}}$, and the latter tends to 0 when n
tends to infinity. □

1.8.3 Why continued fractions are better than decimal fractions. Decimal
fractions for rational numbers are either finite or periodic infinite. Decimal fractions
for irrational numbers like $e$, $\pi$ or $\sqrt{2}$ are chaotic.

Continued fractions for rational numbers are always finite. Infinite periodic
continued fractions correspond to "quadratic irrationalities", that is, to roots of
quadratic equations with rational coefficients. We leave the proof of this statement
as an exercise to the reader (see Exercises 1.4 and 1.5), but we give a couple of
examples. Let

$$\alpha = [1; 1, 1, 1, \dots], \qquad \beta = [2; 2, 2, 2, \dots].$$

Then $\alpha = 1 + \frac{1}{\alpha}$, $\beta = 2 + \frac{1}{\beta}$, hence $\alpha^2 - \alpha - 1 = 0$, $\beta^2 - 2\beta - 1 = 0$, and therefore
$\alpha = \frac{1 + \sqrt{5}}{2}$, $\beta = 1 + \sqrt{2}$ (we take positive roots of the quadratic equations). Thus,
$\alpha$ is the "golden ratio"; also $\sqrt{2} = \beta - 1 = [1; 2, 2, 2, \dots]$.

1.8.4 Why decimal fractions are better than continued fractions. For decimal
fractions, there are convenient algorithms for addition, subtraction, multiplication,
and division (and even for extracting square roots). For continued fractions, there
are almost no such algorithms. Say, if

$$[a_0; a_1, a_2, \dots] + [b_0; b_1, b_2, \dots] = [c_0; c_1, c_2, \dots],$$

then there are no reasonable formulas expressing the $c_i$'s via the $a_i$'s and $b_i$'s. Besides the
obvious relations

$$[a_0; a_1, a_2, \dots] + n = [a_0 + n; a_1, a_2, \dots] \quad (\text{if } n \in \mathbb{Z}),$$
$$[a_0; a_1, a_2, \dots]^{-1} = [0; a_0, a_1, a_2, \dots] \quad (\text{if } a_0 > 0),$$

there are almost no formulas of this kind (see, however, Exercises 1.2 and 1.3).

1.9 The Euclidean Algorithm.

1.9.1 Continued fractions and the Euclidean Algorithm. The Euclidean
algorithm is normally used for finding greatest common divisors. If M and N are two
positive integers and N > M, then a repeated division with remainders yields a
chain of equalities

$$N = a_0 M + b_0,$$
$$M = a_1 b_0 + b_1,$$
$$b_0 = a_2 b_1 + b_2,$$
$$\dots$$
$$b_{n-2} = a_n b_{n-1}$$

where all a's and b's are positive integers and

$$0 < b_{n-1} < b_{n-2} < \dots < b_0 < M.$$

The number $b_{n-1}$ is the greatest common divisor of M and N, and it can be
calculated by means of the Euclidean Algorithm even if M and N are too big for
explicit prime factorization. (It is worth mentioning that the Euclidean Algorithm
may be applied not only to integers, but also to polynomials in one variable with
complex, real, or rational coefficients.)
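The chain of equalities translates directly into a short program. The following sketch (the function name and the returned tuple layout are our choices) records the quotients $a_0, a_1, \dots, a_n$ and the nonzero remainders $b_0, b_1, \dots, b_{n-1}$ and returns the greatest common divisor.

```python
def euclid(N, M):
    """Return (quotients, nonzero remainders, gcd) for positive integers N > M."""
    quotients, remainders = [], []
    big, small = N, M
    while small != 0:
        a, b = divmod(big, small)     # big = a * small + b with 0 <= b < small
        quotients.append(a)
        if b != 0:
            remainders.append(b)
        big, small = small, b
    return quotients, remainders, big  # the last nonzero divisor is the gcd

print(euclid(25, 17))      # ([1, 2, 8], [8, 1], 1)
print(euclid(1071, 462))   # ([2, 3, 7], [147, 21], 21)
```

For N = 25 and M = 17 the quotients 1, 2, 8 coincide with the incomplete quotients of $\frac{25}{17}$, in line with the relation to continued fractions discussed next.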

From our current point of view, however, the most important feature of the 
Euclidean Algorithm is its relation with continued fractions. 

Proposition 1.5. (a) The numbers $a_0, a_1, \dots, a_n$ are the incomplete quotients
of $\frac{N}{M}$:

$$\frac{N}{M} = [a_0; a_1, \dots, a_n].$$

(b) Let $\frac{p_i}{q_i}$ ($i = 0, 1, 2, \dots, n$) be the convergents of $\frac{N}{M}$. Then $b_i = (-1)^i (N q_i - M p_i)$.

Proof of (a).

$$\frac{N}{M} = a_0 + \frac{b_0}{M} = a_0 + \cfrac{1}{M/b_0} = a_0 + \cfrac{1}{a_1 + \cfrac{b_1}{b_0}} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{b_0/b_1}}$$
$$= a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{b_2}{b_1}}} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{b_1/b_2}}} = \dots = [a_0; a_1, \dots, a_n].$$

Proof of (b). For i = 0, 1, the statement is obvious:

$$b_0 = N - M a_0 = N q_0 - M p_0;$$
$$b_1 = M - a_1 b_0 = M - N a_1 + M a_0 a_1 = M (a_0 a_1 + 1) - N a_1 = -(N q_1 - M p_1).$$

Then, by induction,

$$b_i = b_{i-2} - a_i b_{i-1} = (-1)^i \left[N q_{i-2} - M p_{i-2} + a_i (N q_{i-1} - M p_{i-1})\right]$$
$$= (-1)^i \left[N (a_i q_{i-1} + q_{i-2}) - M (a_i p_{i-1} + p_{i-2})\right] = (-1)^i (N q_i - M p_i).$$

□

All of the above can be applied to the case when the integers N, M are replaced
by real numbers $\beta, \gamma > 0$. We get an infinite (if $\frac{\beta}{\gamma}$ is irrational) sequence of
equalities,

$$\beta = a_0 \gamma + b_0,$$
$$\gamma = a_1 b_0 + b_1,$$
$$b_0 = a_2 b_1 + b_2,$$
$$\dots$$

where $a_0$ is an integer, $a_1, a_2, \dots$ are positive integers, and the real numbers $b_i$
satisfy the inequalities

$$0 < \dots < b_2 < b_1 < b_0 < \gamma.$$

Proposition 1.5 can be generalized to this case:

Proposition 1.6. (a) $\frac{\beta}{\gamma} = [a_0; a_1, a_2, \dots]$.

(b) If $\frac{p_i}{q_i}$ is the i-th convergent of $\frac{\beta}{\gamma}$, then $b_i = (-1)^i (\beta q_i - \gamma p_i)$.

(The proof is the same as above.)

1.9.2 Geometric presentation of the Euclidean Algorithm. It is shown in Figure
1.4.

Take a point O in the plane and a line $\ell$ through it (vertical in Figure 1.4).
Take points $A_{-2}$ and $A_{-1}$ at distances $\beta$ and $\gamma$ from $\ell$, both above the horizontal
line through O: $A_{-2}$ to the right of $\ell$ and $A_{-1}$ to the left of $\ell$. Apply the vector
$\overrightarrow{OA_{-1}}$ to the point $A_{-2}$ as many times as possible without crossing the line $\ell$. Let
$A_0$ be the end of the last vector, thus the vector $\overrightarrow{A_0 D}$ crosses $\ell$. Then apply
the vector $\overrightarrow{OA_0}$ to the point $A_{-1}$ as many times as possible without crossing $\ell$;
let $A_1$ be the end of the last vector. Then apply the vector $\overrightarrow{OA_1}$ to $A_0$ and get
the point $A_2$, then $A_3$, $A_4$ (not shown in Figure 1.4), etc. We get two polygonal
lines $A_{-2}A_0A_2A_4\dots$ and $A_{-1}A_1A_3\dots$ converging to $\ell$ from the two sides, and
$\overrightarrow{A_{-2}A_0} = a_0 \overrightarrow{OA_{-1}}$, $\overrightarrow{A_{-1}A_1} = a_1 \overrightarrow{OA_0}$, $\overrightarrow{A_0A_2} = a_2 \overrightarrow{OA_1}$, etc. This construction is
related to the Euclidean algorithm via the column of formulas shown in Figure 1.4.
In particular, $\frac{\beta}{\gamma} = [a_0; a_1, a_2, \dots]$.

Notice that if some point $A_n$ lies on the line $\ell$, then the ratio $\frac{\beta}{\gamma}$ is rational and
equal to $[a_0; a_1, a_2, \dots, a_n]$.

The following observation is very important in the subsequent sections. All
the points marked in Figure 1.4 (not only $A_{-2}, A_{-1}, A_0, A_1, A_2$, but also B, C, D)
belong to the lattice $\Lambda$ generated by the vectors $\overrightarrow{OA_{-2}}$ and $\overrightarrow{OA_{-1}}$. Indeed, consider
the sequence of parallelograms

$$A_{-1}OA_{-2}B, \quad A_{-1}OBC, \quad A_{-1}OCA_0, \quad A_{-1}OA_0D, \quad DOA_0A_1, \quad A_1OA_0A_2, \ \dots$$

Since $A_{-1}, O, A_{-2}$ are points of the lattice, we successively deduce from Proposition
1.1 that $B, C, A_0, D, A_1, A_2, \dots$ are points of the lattice.

Figure 1.4. Geometric presentation of the Euclidean Algorithm. The figure carries the following column of formulas:

$$\overrightarrow{A_{-2}A_0} = a_0 \cdot \overrightarrow{OA_{-1}}, \qquad \overrightarrow{A_{-1}A_1} = a_1 \cdot \overrightarrow{OA_0}, \qquad \overrightarrow{A_0A_2} = a_2 \cdot \overrightarrow{OA_1};$$
$$\operatorname{dist}(A_{-2}, \ell) = \beta, \quad \operatorname{dist}(A_{-1}, \ell) = \gamma, \quad \operatorname{dist}(A_0, \ell) = b_0, \quad \operatorname{dist}(A_1, \ell) = b_1, \quad \operatorname{dist}(A_2, \ell) = b_2;$$
$$\beta = a_0 \gamma + b_0, \qquad \gamma = a_1 b_0 + b_1, \qquad b_0 = a_2 b_1 + b_2.$$

Moreover, the following is true.

Proposition 1.7. No points of the lattice $\Lambda$ lie between the polygonal lines
$A_{-2}A_0A_2A_4\dots$ and $A_{-1}A_1A_3\dots$ (and above $A_{-2}$ and $A_{-1}$).

Proof. The domain between these polygonal lines is covered by the parallelograms
$OA_{-2}BA_{-1}$, $OBCA_{-1}$, $OCA_0A_{-1}$, $OA_0DA_{-1}$, $OA_0A_1D$, $OA_0A_2A_1$, $OA_2EA_1$
(the point E is well above Figure 1.4), etc. These parallelograms have equal areas
(every two consecutive parallelograms have a common base and equal altitudes).
Thus all of them have the same area as the parallelogram $OA_{-2}BA_{-1}$, which is the
elementary parallelogram of $\Lambda$ and has area s. By the area formula in the proof of
Proposition 1.2 (a), a parallelogram of area s with vertices in $\Lambda$ has $q = p = 0$, so
none of them contains any point of $\Lambda$ other than its vertices. □

(By the way, the polygonal lines $A_{-2}A_0A_2A_4\dots$ and $A_{-1}A_1A_3\dots$ may be
constructed as "Newton polygons". Suppose that there is a nail at every point of
the lattice $\Lambda$ to the right of $\ell$ and above $A_{-2}$. Put a horizontal ruler on the plane
so that it touches the nail at $A_{-2}$ and then rotate it clockwise so that it constantly
touches at least one nail. The ruler will be rotating first around $A_{-2}$, then around
$A_0$, then around $A_2$, etc., and it will sweep the exterior domain of the polygonal
line $A_{-2}A_0A_2A_4\dots$.)

1.10 Convergents as the best approximations. Let $\alpha$ be a real number.
In Section 1.6, we considered a lattice $\Lambda$ spanned by the vectors $(-1, 0)$ and $(\alpha, 1)$.
For any p and q, the point

$$p(-1, 0) + q(\alpha, 1) = (q\alpha - p,\ q) = \left(q\left(\alpha - \frac{p}{q}\right),\ q\right)$$

belongs to the lattice; our old indicator of quality of the approximation $\frac{p}{q}$ of $\alpha$ was
equal to the distance of this point from the y axis. The new indicator of quality,
$q^2 \left|\alpha - \frac{p}{q}\right|$, is the absolute value of the product of coordinates of this point. So, the
question, for how many approximations $\frac{p}{q}$ of $\alpha$ this indicator of quality is less than
$\varepsilon$, is equivalent to the question, how many points of the lattice $\Lambda$ above the x axis
($q > 0$) lie within the "hyperbolic cross" $|xy| < \varepsilon$ (Figure 1.5).

Figure 1.5. Lattice points in the "hyperbolic cross"



Let us apply the construction of Subsection 1.9.2 to the lattice $\Lambda$ with $A_{-2} = (\alpha, 1)$
and $A_{-1} = (-1, 0)$. What is the significance of the points $A_0, A_1, A_2, \dots$?

Proposition 1.8. For $n \ge 0$, $A_n = (q_n\alpha - p_n,\ q_n)$ where $p_n$ and $q_n$ are the
numerator and denominator of the irreducible fraction equal to the $n$-th convergent
of the number $\alpha$.

Proof. Induction on $n$. For $n = 0, 1$ we check this directly: since $p_0 = a_0$, $q_0 = 1$,
$p_1 = a_0a_1 + 1$, $q_1 = a_1$ (see Section 1.8),

$$A_0 = A_{-2} + a_0A_{-1} = (\alpha, 1) + a_0(-1, 0) = (\alpha - a_0,\ 1) = (q_0\alpha - p_0,\ q_0),$$
$$A_1 = A_{-1} + a_1A_0 = (-1, 0) + a_1(\alpha - a_0,\ 1) = (a_1\alpha - (a_0a_1 + 1),\ a_1) = (q_1\alpha - p_1,\ q_1).$$

Furthermore, if $n \ge 2$ and the formulas for $A_{n-1}$ and $A_{n-2}$ are true, then

$$A_n = A_{n-2} + a_nA_{n-1} = (q_{n-2}\alpha - p_{n-2},\ q_{n-2}) + a_n(q_{n-1}\alpha - p_{n-1},\ q_{n-1})$$
$$= ((a_nq_{n-1} + q_{n-2})\alpha - (a_np_{n-1} + p_{n-2}),\ a_nq_{n-1} + q_{n-2}) = (q_n\alpha - p_n,\ q_n).$$

□
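As an illustration (not part of the original argument), Proposition 1.8 can be checked with exact rational arithmetic. The sketch below assumes a rational test value $\alpha = 618/973$ with quotients $[0; 1, 1, 1, 2, 1, 6, 13]$; the helper names `lattice_points` and `convergents` are hypothetical.

```python
from fractions import Fraction


def lattice_points(alpha, quotients):
    """Return A_0, A_1, ... built by the rule A_n = A_{n-2} + a_n * A_{n-1}."""
    prev2 = (alpha, Fraction(1))          # A_{-2} = (alpha, 1)
    prev1 = (Fraction(-1), Fraction(0))   # A_{-1} = (-1, 0)
    points = []
    for a in quotients:
        cur = (prev2[0] + a * prev1[0], prev2[1] + a * prev1[1])
        points.append(cur)
        prev2, prev1 = prev1, cur
    return points


def convergents(quotients):
    """Return the pairs (p_n, q_n) from p_n = a_n p_{n-1} + p_{n-2} (same for q_n)."""
    p_prev, q_prev = 1, 0                 # conventional values p_{-1} = 1, q_{-1} = 0
    p, q = quotients[0], 1
    result = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result


alpha = Fraction(618, 973)                # assumed test value
quotients = [0, 1, 1, 1, 2, 1, 6, 13]
points = lattice_points(alpha, quotients)
expected = [(q * alpha - p, Fraction(q)) for p, q in convergents(quotients)]
print(points == expected)
```

Because $\alpha$ here is rational, the last point lies on the $y$ axis: its first coordinate $q_7\alpha - p_7$ vanishes.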

Proposition 1.8 shows that convergents are the best rational approximations of 
real numbers. In particular, the following holds. 



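Proposition 1.9. Let $\varepsilon > 0$. If for only finitely many convergents $\frac{p_n}{q_n}$,
$q_n^2\left|\alpha - \frac{p_n}{q_n}\right| < \varepsilon$, then the whole set of fractions $\frac{p}{q}$ such that
$q^2\left|\alpha - \frac{p}{q}\right| < \varepsilon$ is finite.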



Proof. The assumption implies that for some $n$, all the points $A_{n+1}, A_{n+2}, A_{n+3}, A_{n+4}, \dots$
lie outside the hyperbolic cross $|xy| < \varepsilon$. This means that the whole hyperbolic
cross lies between the polygonal lines $A_{n+1}A_{n+3}A_{n+5}\dots$ and $A_{n+2}A_{n+4}A_{n+6}\dots$
(we use the convexity of a hyperbola: if the points $A_k$ and $A_{k+2}$ lie within a
component of the domain $|xy| > \varepsilon$, then so does the whole segment $A_kA_{k+2}$). But
according to Proposition 1.7, there are no points of the lattice between the two
polygonal lines (and above $A_n$). Thus the hyperbolic cross $|xy| < \varepsilon$ contains no
points of the lattice above $A_n$, whence the proposition. □



Notice that the expression $q^2\left|\alpha - \frac{p}{q}\right|$ is not very important for this proof. The
same statement would hold for the indicator of quality calculated as $q^3\left|\alpha - \frac{p}{q}\right|$,
or $q^{100}\left|\alpha - \frac{p}{q}\right|$, or, actually, any expression $F\left(q, \left|\alpha - \frac{p}{q}\right|\right)$ where the function $F$ has
the property that the domain $F(x, y) > \varepsilon$ within the first or the second quadrant
is convex for any $\varepsilon$.

Thus, convergents provide the best approximations. For example, for the golden
ratio $\frac{1 + \sqrt{5}}{2} = [1; 1, 1, 1, \dots]$, the best approximations are

$$1,\quad [1; 1] = 2,\quad [1; 1, 1] = \frac{3}{2},\quad [1; 1, 1, 1] = \frac{5}{3},\quad [1; 1, 1, 1, 1] = \frac{8}{5},\ \dots;$$

these are the ratios of consecutive Fibonacci numbers (which follows from
Proposition 1.4(b)). For $\sqrt{2} = [1; 2, 2, 2, \dots]$ the best approximations are

$$1,\quad [1; 2] = \frac{3}{2},\quad [1; 2, 2] = \frac{7}{5},\quad [1; 2, 2, 2] = \frac{17}{12},\quad [1; 2, 2, 2, 2] = \frac{41}{29},$$
$$[1; 2, 2, 2, 2, 2] = \frac{99}{70},\ \dots,\quad [1; 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2] = \frac{47321}{33461},\ \dots.$$

We mentioned the last two approximations of $\sqrt{2}$ in Section 1.2; in particular, we
stated that $\frac{99}{70}$ is the best approximation for $\sqrt{2}$ among the fractions with two-digit
denominators.
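The convergents listed above can be reproduced by evaluating finite continued fractions exactly. The following is a minimal sketch; the helper name `convergent` is an assumption introduced for illustration.

```python
from fractions import Fraction


def convergent(quotients):
    """Evaluate the finite continued fraction [a_0; a_1, ..., a_n] exactly."""
    value = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        value = a + 1 / value
    return value


# Golden ratio: all incomplete quotients equal 1.
golden = [convergent([1] * k) for k in range(1, 6)]
# Square root of 2: a_0 = 1, all further quotients equal 2.
sqrt2 = [convergent([1] + [2] * k) for k in range(0, 6)]

print(golden)
print(sqrt2)
print(convergent([1] + [2] * 12))
```

The golden-ratio list consists of ratios of consecutive Fibonacci numbers, as stated in the text.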

What is most surprising, there exists a beautiful formula for the indicators of
quality of convergents.

1.11 Indicator of quality for convergents.

Theorem 1.4. Let $\frac{p_n}{q_n}$ be the (irreducible) $n$-th convergent for the real number
$\alpha = [a_0; a_1, a_2, \dots]$. Then

$$\left|\alpha - \frac{p_n}{q_n}\right| = \frac{1}{\lambda_n q_n^2}$$

where

$$\lambda_n = a_{n+1} + \cfrac{1}{a_{n+2} + \cfrac{1}{a_{n+3} + \cdots}} + \cfrac{1}{a_n + \cfrac{1}{a_{n-1} + \cdots + \cfrac{1}{a_1}}}.$$

The proof is based on the following lemma. 

Lemma 1.10. Let points $A$, $B$ have coordinates $(a_1, a_2)$, $(-b_1, b_2)$ in the standard
rectangular coordinate system with the origin $O$ where $a_1, a_2, b_1, b_2$ are positive.
Then the parallelogram $OACB$ (see Figure 1.6, left) has the area $a_1b_2 + b_1a_2$.




Figure 1.6. Computing the area of a parallelogram 



Proof of Lemma. Add to the parallelogram (Figure 1.6, left) vertical lines
through $A$ and $B$ and a horizontal line through $C$. We get a pentagon $OAFDB$
(see Figure 1.6, right). Divide it into 7 parts as shown in the figure and denote
by $S_i$ the area of the part labeled $i$. Obviously, $EF = GA = a_1$, $DB = HE = a_2$,
$AF = OH = b_2$. It is also obvious that $S_4 = S_2 + S_5$ and $S_1 = S_7$. Thus,

$$\operatorname{area}(OACB) = S_3 + S_4 + S_6 + S_7 = S_3 + (S_2 + S_5) + S_6 + S_1$$
$$= (S_1 + S_2 + S_3) + (S_5 + S_6)$$
$$= \operatorname{area}(HEDB) + \operatorname{area}(AFEG) = b_1a_2 + a_1b_2.$$

□

Figure 1.7. Proof of Theorem 1.4



Proof of Theorem. Consider Figure 1.7, left. It corresponds to the case when
$n$ is even. We shall use the notation $r_k = |\alpha q_k - p_k|$. The coordinates of the points
$A_{-2}, A_{-1}, A_{n-1}, A_n$ are $(\alpha, 1)$, $(-1, 0)$, $(-r_{n-1}, q_{n-1})$, $(r_n, q_n)$ (see Proposition 1.8).
The area of the parallelogram $OA_nEA_{n-1}$ is 1 (see Proposition 1.7 and its proof).
We have the following relations:

(1) $r_{n-1}q_n + r_nq_{n-1} = 1$;

(2) $\dfrac{r_{n-1}}{r_n} = [a_{n+1}; a_{n+2}, a_{n+3}, \dots]$;

(3) $\dfrac{q_n}{q_{n-1}} = [a_n; a_{n-1}, \dots, a_1]$.



Relation (1) is stated by Lemma 1.10. Relation (2) follows from Proposition
1.6 (a) (the Euclidean Algorithm for $\frac{r_{n-1}}{r_n}$ is part of the Euclidean Algorithm for $\frac{\alpha}{1}$
as presented in Figure 1.4). Relation (3) may seem less obvious, but it also follows
from Proposition 1.6 (a). To see this, reflect the points $A_n, A_{n-2}, \dots, A_0$ in the
origin as shown in Figure 1.7, right. We get a picture for the Euclidean Algorithm
for $\frac{q_n}{q_{n-1}}$ (turned by 90° and reflected in the $x$ axis). The polygonal lines similar
to $A_{-2}A_0A_2A_4\dots$ and $A_{-1}A_1A_3A_5\dots$ are $A'_nA'_{n-2}\dots A'_0$ and $A_{n-1}A_{n-3}\dots$.
The second one ends at a point $A_{-1}$ on the $x$ axis which means (as was noticed in
Subsection 1.9.2) that $\frac{q_n}{q_{n-1}}$ is a finite continued fraction $[a_n; a_{n-1}, a_{n-2}, \dots, a_1]$,
as stated by Relation (3).

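Now, we divide Relation (1) by $r_nq_n$ and compute $\lambda_n$:

$$\lambda_n = \frac{1}{r_nq_n} = \frac{r_{n-1}}{r_n} + \frac{q_{n-1}}{q_n} = [a_{n+1}; a_{n+2}, \dots] + \frac{1}{[a_n; a_{n-1}, \dots, a_1]}.$$

Also,

$$\lambda_{n-1} = \frac{1}{r_{n-1}q_{n-1}} = \frac{q_n}{q_{n-1}} + \frac{r_n}{r_{n-1}} = [a_n; a_{n-1}, \dots, a_1] + \frac{1}{[a_{n+1}; a_{n+2}, \dots]}.$$

This completes the proof of the theorem both for even and odd $n$. □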
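The formula of Theorem 1.4 can be tested exactly when the continued fraction is finite, since then both the tail $[a_{n+1}; a_{n+2}, \dots]$ and the reversed head $[a_n; a_{n-1}, \dots, a_1]$ are rational. The sketch below is only a sanity check under that assumption; `cf_value` and `check_theorem` are hypothetical helper names.

```python
from fractions import Fraction


def cf_value(quotients):
    """Exact value of the finite continued fraction [a_0; a_1, ..., a_k]."""
    value = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        value = a + 1 / value
    return value


def check_theorem(quotients, n):
    """Compare |alpha - p_n/q_n| with 1/(lambda_n q_n^2) for 1 <= n < len(quotients) - 1."""
    alpha = cf_value(quotients)
    conv = cf_value(quotients[: n + 1])       # p_n / q_n, automatically in lowest terms
    tail = cf_value(quotients[n + 1:])        # [a_{n+1}; a_{n+2}, ...]
    head = cf_value(quotients[n:0:-1])        # [a_n; a_{n-1}, ..., a_1]
    lam = tail + 1 / head
    return abs(alpha - conv) == 1 / (lam * conv.denominator ** 2)


sample = [0, 1, 1, 1, 2, 1, 6, 13]
print(all(check_theorem(sample, n) for n in range(1, len(sample) - 1)))
```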

Theorem 1.4 shows that while convergents are the best rational approximations
of real numbers, they are not all equally good. The approximation $\frac{p_n}{q_n}$ is really
good if $\lambda_n$ is large, which means, since $a_{n+1} < \lambda_n < a_{n+1} + 2$, that the incomplete
quotient $a_{n+1}$ is large. In this sense, neither the golden ratio, nor $\sqrt{2}$ have really
good approximations. Let us consider the most frequently used irrational numbers,
$\pi$ and $e$. It is not hard to convert decimal approximations provided by pocket
calculators into fragments of continued fractions (we shall discuss this in detail in
Section 1.13). In particular,

$$\pi = [3; 7, 15, 1, 292, 1, 1, 1, 2, \dots], \qquad e = [2; 1, 2, 1, 1, 4, 1, 1, 6, \dots].$$

We see that, unlike $e$, $\pi$ has some big incomplete quotients; the most notable are
15 and 292. The corresponding good approximations are

$$[3; 7] = \frac{22}{7}, \qquad [3; 7, 15, 1] = \frac{355}{113}.$$

The first was known to Archimedes; with its denominator 7, it gives the value of
$\pi$ with the error $1.3 \cdot 10^{-3}$. The second one was discovered almost 4 centuries ago
by Adriaen Metius. It has a remarkable (for a fraction with this denominator)
precision of $2.7 \cdot 10^{-7}$ and gives 6 correct decimal digits of $\pi$. Nothing comparable
exists for $e$: the best approximations (within the fragment of the continued fraction
given above) are $\frac{19}{7}$ (the error is $\approx 4 \cdot 10^{-3}$) and $\frac{193}{71}$ (the error is $\approx 2.8 \cdot 10^{-5}$).

For further information on the continued fraction for $\pi$ and $e$, see [56], Appendix
II.
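The quoted errors can be reproduced numerically. The sketch below evaluates the relevant finite continued fractions exactly and compares them with the floating-point values of $\pi$ and $e$; the helper name `cf_value` is an assumption for illustration.

```python
import math
from fractions import Fraction


def cf_value(quotients):
    """Exact value of the finite continued fraction [a_0; a_1, ..., a_k]."""
    value = Fraction(quotients[-1])
    for a in reversed(quotients[:-1]):
        value = a + 1 / value
    return value


approximations = {
    "pi ~ [3; 7]": (cf_value([3, 7]), math.pi),
    "pi ~ [3; 7, 15, 1]": (cf_value([3, 7, 15, 1]), math.pi),
    "e ~ [2; 1, 2, 1, 1]": (cf_value([2, 1, 2, 1, 1]), math.e),
    "e ~ [2; 1, 2, 1, 1, 4, 1, 1]": (cf_value([2, 1, 2, 1, 1, 4, 1, 1]), math.e),
}

errors = {}
for label, (fraction, target) in approximations.items():
    errors[label] = abs(float(fraction) - target)
    print(label, fraction, f"{errors[label]:.2e}")
```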

1.12 Proof of the Hurwitz-Borel Theorem. Let $\alpha = [a_0; a_1, a_2, \dots]$ be
an irrational number. We need to prove that for infinitely many convergents $\frac{p_n}{q_n}$,

$$\lambda_n = \frac{1}{q_n^2\left|\alpha - \frac{p_n}{q_n}\right|} > \sqrt{5},$$

and this is not always true if $\sqrt{5}$ is replaced by a bigger number.

Case 1. Let infinitely many $a_n$'s be at least 3. Then, for these $n$,

$$\lambda_{n-1} > a_n \ge 3 > \sqrt{5}.$$

Case 2. Let only finitely many $a_n$'s be greater than 2, but infinitely many of
them equal to 2. Then, for infinitely many $n$, $a_{n+1} = 2$, $a_n \le 2$, $a_{n+2} \le 2$, and

$$\lambda_n = a_{n+1} + \cfrac{1}{a_{n+2} + \cdots} + \cfrac{1}{a_n + \cdots} > 2 + \frac{1}{3} + \frac{1}{3} = \frac{8}{3} > \sqrt{5}.$$

Case 3. For sufficiently big $m$, $a_m = 1$. Then, for $n > m$,

$$\lambda_n = [1; 1, 1, 1, \dots] + \frac{1}{[1; 1, 1, \dots, a_1]}.$$

The first summand is the golden ratio, $\frac{\sqrt{5} + 1}{2}$, the second summand tends to
$\left(\frac{\sqrt{5} + 1}{2}\right)^{-1} = \frac{\sqrt{5} - 1}{2}$ when $n \to \infty$ and is greater than $\frac{\sqrt{5} - 1}{2}$ for every other $n$.
Thus, $\lambda_n > \frac{\sqrt{5} + 1}{2} + \frac{\sqrt{5} - 1}{2} = \sqrt{5}$ for infinitely many $n$, but, since $\lim \lambda_n = \sqrt{5}$,
for any $\varepsilon > 0$, the inequality $\lambda_n > \sqrt{5} + \varepsilon$ holds only for finitely many $n$. □
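The behaviour in Case 3 can be observed numerically for the golden ratio itself, where every incomplete quotient equals 1. This is a floating-point illustration only, not a proof; the function name `lambda_all_ones` is hypothetical.

```python
import math
from fractions import Fraction

PHI = (1 + math.sqrt(5)) / 2
SQRT5 = math.sqrt(5)


def lambda_all_ones(n):
    """lambda_n for alpha = [1; 1, 1, ...]: the golden ratio plus 1/[1; 1, ..., 1] (n ones)."""
    head = Fraction(1)
    for _ in range(n - 1):
        head = 1 + 1 / head
    return PHI + float(1 / head)


for n in range(1, 9):
    value = lambda_all_ones(n)
    side = "above" if value > SQRT5 else "below"
    print(n, round(value, 6), side, "sqrt(5)")
```

The printed values alternate around $\sqrt{5}$ and approach it, matching the statement that the second summand exceeds $\frac{\sqrt{5} - 1}{2}$ for every other $n$.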

Comments. It is clear from the proof that only in Case 3 can we not replace
the constant $\sqrt{5}$ by a bigger constant. In this case the number $\alpha$ has the form
$[a_0; a_1, \dots, a_n, 1, 1, 1, \dots]$. The most characteristic representative of this class is
the golden ratio

$$\varphi = \frac{1 + \sqrt{5}}{2} = [1; 1, 1, 1, \dots].$$

One can prove that all numbers of this class are precisely those of the form

$$\frac{a\varphi + b}{c\varphi + d}$$

with $a, b, c, d \in \mathbb{Z}$ and $ad - bc = \pm 1$. If $\alpha$ is not one of these numbers, then the
constant can be increased to $\sqrt{8}$. There are further results of this kind (see
Exercises 1.11-1.14).

In conclusion, let us mention the following theorem, for which its author, Klaus 
Roth, was awarded a Fields medal in 1958. 

Theorem 1.5 (Roth). If $\alpha$ is a solution of an algebraic equation

$$a_nx^n + a_{n-1}x^{n-1} + \dots + a_1x + a_0 = 0$$

with integral coefficients, then for any $\varepsilon > 0$, there exist only finitely many fractions
$\frac{p}{q}$ such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^{2+\varepsilon}}.$$



1.13 Back to the trick. In Section 1.3, we were given two 9-digit decimal
fractions, of which one was obtained by a division of one 3-digit number by another,
while the other one is a random sequence of digits. We need to determine which is
which. If $\alpha$ is a 9-digit approximation of a fraction $\frac{p}{q}$ with a 3-digit denominator
$q$, then

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{10^9} = \frac{1}{1000 \cdot (1000^2)} < \frac{1}{1000q^2}.$$

By Theorem 1.4, this means that one of the incomplete quotients $a_{n+1}$ of $\alpha$ is
greater than 1000, and the corresponding $q_n$ is less than 1000. How big can $n$ be?
Since $q_n = a_nq_{n-1} + q_{n-2}$, the numbers $q_n$ grow at least as fast as the Fibonacci
numbers $F_n$. Since $F_{15} = 987$, $n$ should be at most 15.

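It is very easy to find the beginning fragment of the continued fraction for a
given $\alpha$:

$$\begin{array}{ll}
[\alpha] = a_0; & \\
(\alpha - a_0)^{-1} = \alpha_1, & [\alpha_1] = a_1; \\
(\alpha_1 - a_1)^{-1} = \alpha_2, & [\alpha_2] = a_2; \\
(\alpha_2 - a_2)^{-1} = \alpha_3, & [\alpha_3] = a_3; \\
\dots &
\end{array}$$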
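The expansion procedure above translates directly into code. The sketch below applies it, with exact rational arithmetic, to the two 9-digit decimals of Section 1.3 (0.635149023 and 0.728101457); the function name `continued_fraction` is an assumption. Note that quotients far down the expansion are sensitive to rounding, so values obtained on a pocket calculator may differ from exact ones.

```python
from fractions import Fraction


def continued_fraction(x, max_terms=20):
    """Expand an exact rational x using a_k = [x_k], x_{k+1} = 1 / (x_k - a_k)."""
    terms = []
    for _ in range(max_terms):
        a = x.numerator // x.denominator   # integer part [x]
        terms.append(a)
        fractional = x - a
        if fractional == 0:                # the expansion of a rational number terminates
            break
        x = 1 / fractional
    return terms


first = continued_fraction(Fraction(635149023, 10**9))
second = continued_fraction(Fraction(728101457, 10**9))
print(first[:9])
print(second[:9])
```

A very large incomplete quotient appearing while the denominators $q_n$ are still below 1000 is the signal that the number is close to a fraction with a 3-digit denominator.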



Using this algorithm, we can find a few incomplete quotients of the two numbers
given in Section 1.3 relatively fast:

0.635149023 = [0; 1, 1, 1, 2, 1, 6, 13, 1204, 1, ...]

0.728101457 = [0; 1, 2, 1, 2, 9, 1, 1, 1, 1, 3, 1, 15, 1, 59, 7, 1, 39, ...]

Obviously, the first number, and not the second one, has a very good rational
approximation, namely [0; 1, 1, 1, 2, 1, 6, 13].

Next step: using the relations from Proposition 1.4 (a-b), we can find the
corresponding convergents:

$$\begin{array}{ll}
p_0 = a_0 = 0, & q_0 = 1, \\
p_1 = a_0a_1 + 1 = 1, & q_1 = a_1 = 1, \\
p_2 = 1 \cdot p_1 + p_0 = 1, & q_2 = 1 \cdot q_1 + q_0 = 2, \\
p_3 = 1 \cdot p_2 + p_1 = 2, & q_3 = 1 \cdot q_2 + q_1 = 3, \\
p_4 = 2 \cdot p_3 + p_2 = 5, & q_4 = 2 \cdot q_3 + q_2 = 8, \\
p_5 = 1 \cdot p_4 + p_3 = 7, & q_5 = 1 \cdot q_4 + q_3 = 11, \\
p_6 = 6 \cdot p_5 + p_4 = 47, & q_6 = 6 \cdot q_5 + q_4 = 74, \\
p_7 = 13 \cdot p_6 + p_5 = 618, & q_7 = 13 \cdot q_6 + q_5 = 973.
\end{array}$$

Final result: the first number is rational, it is $\frac{618}{973}$ (to be on the safe side, you
can divide 618 by 973 using your calculator, and you will get precisely 0.635149023).
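The table of convergents and the final division can be double-checked in a few lines. This is a sketch using the recurrences $p_n = a_np_{n-1} + p_{n-2}$, $q_n = a_nq_{n-1} + q_{n-2}$; the function name `convergents` is hypothetical.

```python
def convergents(quotients):
    """Return the pairs (p_n, q_n) for the continued fraction [a_0; a_1, ..., a_k]."""
    p_prev, q_prev = 1, 0        # conventional values p_{-1} = 1, q_{-1} = 0
    p, q = quotients[0], 1
    pairs = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        pairs.append((p, q))
    return pairs


pairs = convergents([0, 1, 1, 1, 2, 1, 6, 13])
p7, q7 = pairs[-1]
nine_digits = p7 * 10**9 // q7   # first nine decimal digits of p7 / q7
print(pairs)
print(f"0.{nine_digits:09d}")
```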

1.14 Epilogue. Bob (enters through the door on the right): You were right,
a calculator cannot give a proof that $\sqrt{2}$ is irrational.

Alice (enters through the door on the left): No, you were right: using a calculator, 

r- 25 

you can certainly distinguish between numbers like y2 and — . 

Bob: Yes, but still it is not a proof of irrationality. I read in a history book that
when Pythagoras found a proof that $\sqrt{2}$ was irrational, he invited all his friends to
a dinner to celebrate this discovery.

Alice: Well, we shall not invite all our friends, but let's have a nice dinner now. My 
pie is ready. 

Bob: Oh, pi! There is a wonderful approximation for pi found by Metius! 
Alice: But I don't mean this pi, I mean my apple pie. 

Bob: Then let's go and try it. (They leave together through a door in the middle). 






1.15 Exercises. 

1.1. (Pick's Formula.) Let $P$ be a non-self-intersecting polygon whose vertices
are points of a lattice with the area of the elementary parallelogram $s$. Let $m$ be the
number of points of the lattice inside $P$ and $n$ the number of points of the lattice
on the boundary of $P$ (including the vertices). Prove that

$$\operatorname{area}(P) = \left(m + \frac{n}{2} - 1\right)s.$$

Hint. Cut $P$ into triangles with vertices in the points of the lattice and without
other points of the lattice on the sides and inside and investigate how the right
hand side of the equality behaves.

1.2. Prove that

$$-[a_0; a_1, a_2, \dots] = \begin{cases} [-1 - a_0;\ 1,\ a_1 - 1,\ a_2, \dots], & \text{if } a_1 > 1, \\ [-1 - a_0;\ a_2 + 1,\ a_3, \dots], & \text{if } a_1 = 1. \end{cases}$$

1.3. (a) Prove that if $a_0, a_2, a_4, a_6, \dots$ are divisible by $n$, then

$$\frac{[a_0; a_1, a_2, \dots]}{n} = \left[\frac{a_0}{n};\ na_1,\ \frac{a_2}{n},\ na_3, \dots\right].$$

(b) Prove that if $a_1, a_3, a_5, \dots$ are divisible by $n$, then

$$n[a_0; a_1, a_2, \dots] = \left[na_0;\ \frac{a_1}{n},\ na_2,\ \frac{a_3}{n}, \dots\right].$$

1.4. Assume that

$$\alpha = [a_0; a_1, a_2, \dots]$$

is a periodic continued fraction, that is, for some $r \ge 0$ and $d > 0$, $a_{m+d} = a_m$ for
all $m \ge r$. Prove that $\alpha$ is a root of a quadratic equation with integer coefficients.
Hint. Begin with the case $r = 0$.

1.5. ** Prove the converse: if a is a root of a quadratic equation with integer 
coefficients, then a is represented by a periodic continued fraction. 

1.6. Find the continued fractions representing $\sqrt{3}$, $\sqrt{5}$, $\sqrt{n^2 + 1}$, $\sqrt{n^2 - 1}$.

1.7. Using Exercises 1.6 and 1.3, find the continued fractions representing 

4v/5, and 

'2 2 

1.8. (Preparation to Exercise 1.9.) Let $\alpha, \beta$ be real numbers. We say that $\alpha$
is related to $\beta$, if $\alpha = \frac{a\beta + b}{c\beta + d}$ where $a, b, c, d$ are integers and $ad - bc = \pm 1$. Prove
that if $\alpha$ is related to $\beta$, then $\beta$ is related to $\alpha$. Prove also that if $\alpha$ is related to $\beta$
and $\beta$ is related to $\gamma$, then $\alpha$ is related to $\gamma$.

1.9. * Let

$$\alpha = [a_0; a_1, a_2, \dots], \quad \beta = [b_0; b_1, b_2, \dots]$$

be "almost identical" continued fractions, that is, there are non-negative integers
$k, \ell$ such that $a_{k+m} = b_{\ell+m}$ for all $m \ge 0$. Prove that $\alpha$ and $\beta$ are related.

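1.10. * Prove the converse: if $\alpha$ and $\beta$ are related, then their continued fractions
are almost identical (see Exercise 1.9).

Hint. The following lemma might be useful. If $\alpha$ and $\beta$ are related, then there
is a sequence of real numbers, $\alpha_0, \alpha_1, \dots, \alpha_N$ such that $\alpha_0 = \alpha$, $\alpha_N = \beta$, and for
$1 \le i \le N$,

$$\alpha_i = -\alpha_{i-1}, \quad \text{or} \quad \alpha_i = \alpha_{i-1} + 1, \quad \text{or} \quad \alpha_i = \frac{1}{\alpha_{i-1}}.$$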

1.11. Prove that if $\alpha$ is not related to the golden ratio (that is,
$\alpha \ne [a_0; a_1, a_2, \dots, a_r, 1, 1, 1, \dots]$),
then $\sqrt{5}$ in the Hurwitz-Borel Theorem can be replaced by $\sqrt{8}$.

1.12. Prove that if $\alpha$ is not related to the golden ratio or to $\sqrt{2}$, then $\sqrt{5}$ in
the Hurwitz-Borel Theorem can be replaced by $\sqrt{\frac{221}{25}}$.

Remark to Exercises 1.11 and 1.12. The reader can extend the sequence $\sqrt{5}$, $\sqrt{8}$,
$\sqrt{\frac{221}{25}}$ as long as he wishes (so, if $\alpha$ is not related to the golden ratio, $\sqrt{2}$ and one more
specific number, then $\sqrt{5}$ in the Hurwitz-Borel Theorem can be replaced by a still
bigger constant, and so on). The resulting sequence will converge to 3.

1.13. Prove that there are uncountably many real numbers $\alpha$ with the following
property: if $\lambda > 3$, then there are only finitely many fractions $\frac{p}{q}$ such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{\lambda q^2}.$$

Hint. Try the numbers

$$[\underbrace{1; 1, \dots, 1}_{n_0}, 2, 2, \underbrace{1, 1, \dots, 1}_{n_1}, 2, 2, \underbrace{1, 1, \dots, 1}_{n_2}, 2, 2, 1, \dots]$$

where $n_0, n_1, n_2, \dots$ is an increasing sequence of integers.

1.14. ** The number 3 in Exercise 1.13 cannot be decreased. 

1.15. Find the smallest number $\lambda_n$ with the following property. If $\alpha = [a_0; a_1, a_2, \dots]$
and $a_k > n$ for $k$ sufficiently large, then for any $\lambda > \lambda_n$ there are only finitely many
fractions $\frac{p}{q}$ such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{\lambda q^2}.$$

Arithmetical Properties of Binomial Coefficients



2.1 Binomial coefficients and Pascal's triangle. We first encounter binomial
coefficients in the chain of formulas

$$\begin{array}{rcl}
(a + b)^0 & = & 1 \\
(a + b)^1 & = & a + b \\
(a + b)^2 & = & a^2 + 2ab + b^2 \\
(a + b)^3 & = & a^3 + 3a^2b + 3ab^2 + b^3 \\
(a + b)^4 & = & a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4 \\
\dots & &
\end{array}$$

as the coefficients in the right hand sides. The coefficient of $a^mb^{n-m}$ (where
$0 \le m \le n$) is denoted by $\binom{n}{m}$ (or, sometimes, by $C_n^m$), which is pronounced as "n
choose m" (we shall explain this below). There are two major ways to calculate
the numbers $\binom{n}{m}$. One is given by the recursive Pascal Triangle Formula:

$$\binom{n}{m} = \binom{n-1}{m-1} + \binom{n-1}{m}$$

which has a simple proof:

$$\dots + \binom{n}{m}a^mb^{n-m} + \dots = (a + b)^n = (a + b)^{n-1}(a + b)$$
$$= \left(\dots + \binom{n-1}{m-1}a^{m-1}b^{n-m} + \binom{n-1}{m}a^mb^{n-m-1} + \dots\right)(a + b)$$
$$= \dots + \left[\binom{n-1}{m-1} + \binom{n-1}{m}\right]a^mb^{n-m} + \dots$$

The second expression for the binomial coefficients is the formula

$$\binom{n}{m} = \frac{n(n-1)\dots(n-m+1)}{1 \cdot 2 \cdot \dots \cdot m} = \frac{n!}{m!(n-m)!}$$

which can be deduced from Pascal Triangle Formula by induction: it is obviously
true for $n = 0$, and if it is true for $\binom{n-1}{k}$ (for all $k$ between $0$ and $n - 1$), then

$$\binom{n}{m} = \binom{n-1}{m-1} + \binom{n-1}{m} = \frac{(n-1)!}{(m-1)!(n-m)!} + \frac{(n-1)!}{m!(n-m-1)!}$$
$$= \frac{(n-1)! \cdot m + (n-1)! \cdot (n-m)}{m!(n-m)!} = \frac{(n-1)! \cdot n}{m!(n-m)!} = \frac{n!}{m!(n-m)!}.$$

The Pascal Triangle Formula gives rise to the Pascal Triangle, a beautiful trian- 
gular table which contains all the binomial coefficients and which can be extended 
downward infinitely. 

                            1
                           1 1
                          1 2 1
                         1 3 3 1
                        1 4 6 4 1
                      1 5 10 10 5 1
                     1 6 15 20 15 6 1
                   1 7 21 35 35 21 7 1
                  1 8 28 56 70 56 28 8 1
               1 9 36 84 126 126 84 36 9 1
           1 10 45 120 210 252 210 120 45 10 1
         1 11 55 165 330 462 462 330 165 55 11 1
       1 12 66 220 495 792 924 792 495 220 66 12 1
   1 13 78 286 715 1287 1716 1716 1287 715 286 78 13 1
1 14 91 364 1001 2002 3003 3432 3003 2002 1001 364 91 14 1



In this table, the $n$-th row (the top row with just one 1 has number 0) consists
of the numbers

$$\binom{n}{0}, \binom{n}{1}, \dots, \binom{n}{n-1}, \binom{n}{n}.$$

The Pascal Triangle Formula means that every number in this table, with the
exception of the upper 1, is equal to the sum of the two numbers above it (for
example, 56 in the 8-th row is 21 + 35). Here we regard the blank spots as zeros.

To legalize the last remark, we assume that $\binom{n}{m}$ is defined for all integers
$n, m$, provided that $n \ge 0$: we set $\binom{n}{m} = 0$, if $m < 0$ or $m > n$. This does
not contradict the Pascal Triangle Formula (provided $n \ge 1$), so we can use this
formula for any $m$.

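Let us deduce some immediate corollaries from the Binomial Formula

$$(a + b)^n = \binom{n}{0}b^n + \binom{n}{1}ab^{n-1} + \dots + \binom{n}{n-1}a^{n-1}b + \binom{n}{n}a^n.$$

Proposition 2.1. (a)
$$\binom{n}{0} + \binom{n}{1} + \dots + \binom{n}{n-1} + \binom{n}{n} = 2^n.$$
(b) If $n \ge 1$, then
$$\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \dots + (-1)^n\binom{n}{n} = 0.$$
(c) If $n \ge 1$, then
$$\binom{n}{0} + \binom{n}{2} + \binom{n}{4} + \dots = \binom{n}{1} + \binom{n}{3} + \binom{n}{5} + \dots = 2^{n-1}.$$

Proof. The Binomial Formula yields (a) if one takes $a = b = 1$ and yields (b) if
one takes $a = 1$, $b = -1$. The formula (c) follows from (a) and (b). □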
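The two ways of computing binomial coefficients, and the row-sum identities of Proposition 2.1, can be compared on the first rows of the triangle. This is an illustrative sketch; `pascal_rows` and `binom_factorial` are hypothetical helper names.

```python
from math import factorial


def pascal_rows(count):
    """Build rows 0 .. count-1 with the Pascal Triangle Formula."""
    rows = [[1]]
    for _ in range(count - 1):
        padded = [0] + rows[-1] + [0]          # blank spots are regarded as zeros
        rows.append([padded[i] + padded[i + 1] for i in range(len(padded) - 1)])
    return rows


def binom_factorial(n, m):
    """n! / (m! (n-m)!), extended by zero outside 0 <= m <= n."""
    if m < 0 or m > n:
        return 0
    return factorial(n) // (factorial(m) * factorial(n - m))


rows = pascal_rows(15)
same = all(rows[n][m] == binom_factorial(n, m) for n in range(15) for m in range(n + 1))
row_sums_ok = all(sum(rows[n]) == 2 ** n for n in range(15))
alternating_ok = all(sum((-1) ** m * c for m, c in enumerate(rows[n])) == 0 for n in range(1, 15))
even_odd_ok = all(sum(rows[n][0::2]) == sum(rows[n][1::2]) == 2 ** (n - 1) for n in range(1, 15))
print(same, row_sums_ok, alternating_ok, even_odd_ok)
```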

2.2 Pascal Triangle, combinatorics and probability. 

Proposition 2.2. There are $\binom{n}{m}$ ways to choose $m$ things out of a collection
of $n$ (different) things.

Remark 2.3. (1) This Proposition explains the term "n choose m".
(2) If $m < 0$ or $m > n$, then there are no ways to choose $m$ things out of $n$. This
fact matches the equality $\binom{n}{m} = 0$ for $m < 0$ or $m > n$.

Proof of Proposition. Again, induction. For $n = 0$, the fact is obvious. Assume
that the Proposition holds for the case of $n - 1$ things. Let $n$ things be given
($n \ge 1$). Mark one of them. When we choose $m$ things out of our $n$ things, we
either take, or do not take, the marked thing. If we take it, then we need to choose
$m - 1$ things out of the remaining $n - 1$; this can be done in $\binom{n-1}{m-1}$ ways. If we
do not take the marked thing, then we need to choose $m$ things out of $n - 1$, which
can be done in $\binom{n-1}{m}$ ways. Thus, the total number of choices is

$$\binom{n-1}{m-1} + \binom{n-1}{m} = \binom{n}{m},$$

and we are done. □



As an aside, this Proposition has immediate applications to probability. For
example, if you randomly pull 4 cards out of a deck of 52 cards, the probability to
get 4 aces is

$$\frac{1}{\binom{52}{4}} = \frac{4! \cdot 48!}{52!} = \frac{1}{270725}$$

(there are $\binom{52}{4}$ choices, and only one of them gives you 4 aces). The probability
of getting 4 spades is higher: it is

$$\frac{\binom{13}{4}}{\binom{52}{4}} = \frac{13! \cdot 4! \cdot 48!}{4! \cdot 9! \cdot 52!} = \frac{11}{4165} \approx 2.64 \cdot 10^{-3}$$

(the total number of choices of 4 cards is $\binom{52}{4}$, the number of choices of 4 spades
is $\binom{13}{4}$).
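These two probabilities are easy to recompute exactly. The sketch below uses only the standard library; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

total_hands = comb(52, 4)                       # ways to choose 4 cards out of 52
p_four_aces = Fraction(1, total_hands)          # exactly one hand consists of the 4 aces
p_four_spades = Fraction(comb(13, 4), total_hands)

print(total_hands, p_four_aces, p_four_spades, float(p_four_spades))
```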



2.3 Pascal Triangle and trigonometry. The reader is probably familiar
with the formulas

$$\sin 2\theta = 2\sin\theta\cos\theta,$$
$$\cos 2\theta = \cos^2\theta - \sin^2\theta.$$

And what about $\sin 3\theta$? $\cos 5\theta$? $\sin 12\theta$? All such formulas are contained in the
Pascal Triangle:

$$\begin{array}{ll}
\cos 0\theta = 1 & \\
\cos 1\theta = 1\cos\theta & \sin 1\theta = 1\sin\theta \\
\cos 2\theta = 1\cos^2\theta - 1\sin^2\theta & \sin 2\theta = 2\cos\theta\sin\theta \\
\cos 3\theta = 1\cos^3\theta - 3\cos\theta\sin^2\theta & \sin 3\theta = 3\cos^2\theta\sin\theta - 1\sin^3\theta \\
\cos 4\theta = 1\cos^4\theta - 6\cos^2\theta\sin^2\theta + 1\sin^4\theta & \sin 4\theta = 4\cos^3\theta\sin\theta - 4\cos\theta\sin^3\theta
\end{array}$$



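Can you see the Pascal Triangle here? It is slightly spoiled by the signs. Here
is the result.

Proposition 2.4.

$$\sin n\theta = \binom{n}{1}\cos^{n-1}\theta\sin\theta - \binom{n}{3}\cos^{n-3}\theta\sin^3\theta + \binom{n}{5}\cos^{n-5}\theta\sin^5\theta - \dots$$
$$\cos n\theta = \cos^n\theta - \binom{n}{2}\cos^{n-2}\theta\sin^2\theta + \binom{n}{4}\cos^{n-4}\theta\sin^4\theta - \dots$$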

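Proof. Induction, as usual. For $n = 1$, the formulas are tautological. If the
formulas for $\sin(n-1)\theta$ and $\cos(n-1)\theta$ ($n > 1$) hold then

$$\sin n\theta = \sin((n-1)\theta + \theta) = \sin(n-1)\theta\cos\theta + \sin\theta\cos(n-1)\theta$$
$$= \left(\binom{n-1}{1}\cos^{n-2}\theta\sin\theta - \binom{n-1}{3}\cos^{n-4}\theta\sin^3\theta + \dots\right)\cos\theta$$
$$\quad + \sin\theta\left(\binom{n-1}{0}\cos^{n-1}\theta - \binom{n-1}{2}\cos^{n-3}\theta\sin^2\theta + \dots\right)$$
$$= \left(\binom{n-1}{0} + \binom{n-1}{1}\right)\cos^{n-1}\theta\sin\theta - \left(\binom{n-1}{2} + \binom{n-1}{3}\right)\cos^{n-3}\theta\sin^3\theta + \dots$$
$$= \binom{n}{1}\cos^{n-1}\theta\sin\theta - \binom{n}{3}\cos^{n-3}\theta\sin^3\theta + \dots,$$

and similarly for

$$\cos n\theta = \cos((n-1)\theta + \theta) = \cos(n-1)\theta\cos\theta - \sin(n-1)\theta\sin\theta,$$

as needed. □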

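There is also a formula for $\tan n\theta$ (generalizing the textbook formula for $\tan 2\theta$):

$$\tan n\theta = \frac{\binom{n}{1}\tan\theta - \binom{n}{3}\tan^3\theta + \binom{n}{5}\tan^5\theta - \dots}{1 - \binom{n}{2}\tan^2\theta + \binom{n}{4}\tan^4\theta - \binom{n}{6}\tan^6\theta + \dots}$$

(see Exercise 2.1).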
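The multiple-angle formulas above can be checked numerically. The following floating-point sketch sums the alternating binomial expansions and compares them with the library functions; the function names and the test angles are assumptions chosen for illustration.

```python
import math
from math import comb


def sin_multiple(n, theta):
    """Sum of (-1)^(k//2) C(n,k) cos^(n-k) sin^k over odd k."""
    c, s = math.cos(theta), math.sin(theta)
    return sum((-1) ** (k // 2) * comb(n, k) * c ** (n - k) * s ** k
               for k in range(1, n + 1, 2))


def cos_multiple(n, theta):
    """Sum of (-1)^(k//2) C(n,k) cos^(n-k) sin^k over even k."""
    c, s = math.cos(theta), math.sin(theta)
    return sum((-1) ** (k // 2) * comb(n, k) * c ** (n - k) * s ** k
               for k in range(0, n + 1, 2))


def tan_multiple(n, theta):
    """Quotient of the two expansions after dividing by cos^n."""
    t = math.tan(theta)
    numerator = sum((-1) ** (k // 2) * comb(n, k) * t ** k for k in range(1, n + 1, 2))
    denominator = sum((-1) ** (k // 2) * comb(n, k) * t ** k for k in range(0, n + 1, 2))
    return numerator / denominator


print(sin_multiple(12, 0.3), math.sin(12 * 0.3))
print(cos_multiple(5, 1.1), math.cos(5 * 1.1))
print(tan_multiple(3, 0.2), math.tan(3 * 0.2))
```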

These applications, however, do not represent the main goal of this lecture. 
We will be interested mainly in arithmetical properties of binomial coefficients, like 
divisibilities, remainders, and so on. 



2.4 Pascal triangle mod p. Let us take the Pascal triangle and replace
every odd number by a black dot, •, and every even number by a white dot, o. The
resulting picture will resemble the Sierpinski carpet (for those who know what the
Sierpinski carpet, a.k.a. Sierpinski gasket, is).

                 •
                • •
               • o •
              • • • •
             • o o o •
            • • o o • •
           • o • o • o •
          • • • • • • • •
         • o o o o o o o •
        • • o o o o o o • •
       • o • o o o o o • o •
      • • • • o o o o • • • •
     • o o o • o o o • o o o •
    • • o o • • o o • • o o • •
   • o • o • o • o • o • o • o •
  • • • • • • • • • • • • • • • •
 • o o o o o o o o o o o o o o o •
• • o o o o o o o o o o o o o o • •

A close look at this picture reveals the following. Let $2^r \le n < 2^{r+1}$. Then

(1) if $m \le n - 2^r$, then $\binom{n}{m}$ has the same parity as $\binom{n - 2^r}{m}$;

(2) if $m \ge 2^r$, then $\binom{n}{m}$ has the same parity as $\binom{n - 2^r}{m - 2^r}$;

(3) if $n - 2^r < m < 2^r$, then $\binom{n}{m}$ is even.
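The three observations can be confirmed by brute force for small rows. The sketch below is a finite check only, with a hypothetical function name `check_parity_rules`.

```python
from math import comb


def check_parity_rules(max_n):
    """Test observations (1)-(3) for all 1 <= n <= max_n and 0 <= m <= n."""
    for n in range(1, max_n + 1):
        r = n.bit_length() - 1            # the r with 2**r <= n < 2**(r+1)
        shift = 2 ** r
        for m in range(n + 1):
            parity = comb(n, m) % 2
            if m <= n - shift:            # observation (1)
                expected = comb(n - shift, m) % 2
            elif m >= shift:              # observation (2)
                expected = comb(n - shift, m - shift) % 2
            else:                         # observation (3): the middle zone is even
                expected = 0
            if parity != expected:
                return False
    return True


print(check_parity_rules(64))
```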

The following result generalizes these observations to the case of an arbitrary
prime $p$.

Theorem 2.1 (Lucas, 1872). Let $p$ be a prime number, and let $n, m, q, r$ be
non-negative integers with $0 \le q < p$, $0 \le r < p$. Then

$$\binom{pn + q}{pm + r} \equiv \binom{n}{m}\binom{q}{r} \bmod p.$$

(We assume the reader is familiar with the symbol $\equiv$. The formula $A \equiv B \bmod N$,
"A is congruent to B modulo N", means that $A - B$ is divisible by $N$, or $A$
and $B$ have equal remainders after division by $N$. We shall also use this symbol for
polynomials with integral coefficients: $P \equiv Q \bmod N$ if all the coefficients of the
polynomial $P - Q$ are divisible by $N$.)
To prove the theorem, we need a lemma. 

Lemma 2.5. If $0 < m < p$, then $\binom{p}{m}$ is divisible by $p$ (and not divisible by $p^2$;
but we do not need this).
Proof of Lemma.

$$\binom{p}{m} = \frac{p(p-1)\dots(p-m+1)}{1 \cdot 2 \cdot \dots \cdot m}$$

and no factor in the numerator and denominator, except $p$ in the numerator, is
divisible by $p$. □

Proof of Theorem. The Lemma implies that $(a + b)^p \equiv a^p + b^p \bmod p$. Therefore,

$$(a + b)^{pn+q} = ((a + b)^p)^n(a + b)^q \equiv (a^p + b^p)^n(a + b)^q \bmod p,$$
$$(a^p + b^p)^n(a + b)^q = \left(a^{pn} + \dots + \binom{n}{m}a^{pm}b^{p(n-m)} + \dots + b^{pn}\right)\left(a^q + \dots + \binom{q}{r}a^rb^{q-r} + \dots + b^q\right)$$

and it is clear that the term $a^{pm+r}b^{p(n-m)+(q-r)}$ appears in the last expression only
once and with the coefficient $\binom{n}{m}\binom{q}{r}$, whence

$$\binom{pn + q}{pm + r} \equiv \binom{n}{m}\binom{q}{r} \bmod p,$$

and we are done. □
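Lucas' Theorem can be tested by brute force over a small range of parameters. This is a finite sanity check only; the function name `lucas_holds` and the chosen limits are assumptions.

```python
from math import comb


def lucas_holds(p, limit):
    """Check C(pn+q, pm+r) = C(n,m) C(q,r) mod p for 0 <= n, m < limit and 0 <= q, r < p."""
    for n in range(limit):
        for m in range(limit):
            for q in range(p):
                for r in range(p):
                    lhs = comb(p * n + q, p * m + r) % p
                    rhs = comb(n, m) * comb(q, r) % p    # comb returns 0 when the lower index is larger
                    if lhs != rhs:
                        return False
    return True


print(all(lucas_holds(p, 6) for p in (2, 3, 5, 7)))
```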



To state a nice corollary of Lucas' Theorem, recall that, whether $p$ is prime or
not, every positive integer $n$ has a unique presentation as $n_rp^r + n_{r-1}p^{r-1} + \dots +
n_1p + n_0$ with $0 < n_r < p$ and $0 \le n_i < p$ for $i = 0, 1, \dots, r - 1$. We shall use the
brief notation $n = (n_rn_{r-1}\dots n_1n_0)_p$. The numbers $n_i$ are called digits of $n$ in the
numerical system with base $p$. If $p = 10$, then these digits are usual ("decimal")
digits. Examples: $321 = (321)_{10} = (2241)_5 = (101000001)_2$. Note that we can use
the presentation of numbers in the numerical system with an arbitrary base to add,
subtract (and multiply; and even divide) numbers, precisely as we do this using the
decimal system.

Let us return to our assumption that $p$ is prime.

Corollary 2.2. Let $n = (n_rn_{r-1}\dots n_1n_0)_p$, $m = (m_rm_{r-1}\dots m_1m_0)_p$ (we
allow $m_r$ to be zero). Then

$$\binom{n}{m} \equiv \binom{n_r}{m_r}\binom{n_{r-1}}{m_{r-1}}\dots\binom{n_1}{m_1}\binom{n_0}{m_0} \bmod p.$$

Proof. Induction on $r$. The case $r = 0$ is obvious; assume that our congruence
holds if $r$ is replaced by $r - 1$. Then $n = pn' + n_0$, $m = pm' + m_0$ where $n' =
(n_rn_{r-1}\dots n_2n_1)_p$, $m' = (m_rm_{r-1}\dots m_2m_1)_p$. Lucas' Theorem and the induction
hypothesis give, respectively,

$$\binom{n}{m} \equiv \binom{n'}{m'}\binom{n_0}{m_0} \bmod p,$$
$$\binom{n'}{m'} \equiv \binom{n_r}{m_r}\dots\binom{n_1}{m_1} \bmod p,$$

whence

$$\binom{n}{m} \equiv \binom{n_r}{m_r}\dots\binom{n_1}{m_1}\binom{n_0}{m_0} \bmod p,$$

and we are done. □



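This result shows that binomial coefficients have a tendency to be divisible by
prime numbers: if at least one $m_i$ exceeds the corresponding $n_i$ then the product on
the right hand side of the last congruence is zero. Example: what is the remainder
of $\binom{31241}{17101}$ modulo 3? Since $31241 = (1120212002)_3$ and $17101 = (0212110101)_3$,

$$\binom{31241}{17101} \equiv \binom{1}{0}\binom{1}{2}\binom{2}{1}\binom{0}{2}\binom{2}{1}\binom{1}{1}\binom{2}{0}\binom{0}{1}\binom{0}{0}\binom{2}{1}$$
$$= 1 \cdot 0 \cdot 2 \cdot 0 \cdot 2 \cdot 1 \cdot 1 \cdot 0 \cdot 1 \cdot 2 = 0 \bmod 3.$$

On the other hand, $31241 = (1444431)_5$, $17101 = (1021401)_5$, and

$$\binom{31241}{17101} \equiv \binom{1}{1}\binom{4}{0}\binom{4}{2}\binom{4}{1}\binom{4}{4}\binom{3}{0}\binom{1}{1}$$
$$= 1 \cdot 1 \cdot 6 \cdot 4 \cdot 1 \cdot 1 \cdot 1 = 24 \equiv 4 \bmod 5.$$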
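Corollary 2.2 gives a digit-by-digit method for finding a binomial coefficient modulo a prime. The sketch below implements it and compares the result with a direct computation for the example above; `digits` and `binom_mod_p` are hypothetical helper names.

```python
from math import comb


def digits(n, p):
    """Base-p digits of n, least significant first."""
    out = []
    while n:
        n, d = divmod(n, p)
        out.append(d)
    return out or [0]


def binom_mod_p(n, m, p):
    """C(n, m) mod p as the product of C(n_i, m_i) over base-p digits (p prime)."""
    n_digits, m_digits = digits(n, p), digits(m, p)
    if len(m_digits) > len(n_digits):
        return 0                                   # m > n, so the coefficient is zero
    m_digits = m_digits + [0] * (len(n_digits) - len(m_digits))
    result = 1
    for a, b in zip(n_digits, m_digits):
        result = result * comb(a, b) % p
    return result


print(digits(31241, 3)[::-1], digits(17101, 3)[::-1])
print(binom_mod_p(31241, 17101, 3), binom_mod_p(31241, 17101, 5))
```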

Note in conclusion that Corollary 2.2 explains the observations made in the
beginning of this section. If $2^r \le n < 2^{r+1}$, then $n = (1n_{r-1}\dots n_1n_0)_2$ ($n_i = 0$
or 1 for $i = 0, \dots, r - 1$). If $m \le n - 2^r$, then $m = (0m_{r-1}\dots m_1m_0)_2$ and

$$\binom{n}{m} \equiv \binom{1}{0}\binom{n_{r-1}}{m_{r-1}}\dots\binom{n_0}{m_0} = \binom{n_{r-1}}{m_{r-1}}\dots\binom{n_0}{m_0} \equiv \binom{n - 2^r}{m} \bmod 2.$$

If $m \ge 2^r$, then $m = (1m_{r-1}\dots m_1m_0)_2$ and

$$\binom{n}{m} \equiv \binom{1}{1}\binom{n_{r-1}}{m_{r-1}}\dots\binom{n_0}{m_0} = \binom{n_{r-1}}{m_{r-1}}\dots\binom{n_0}{m_0} \equiv \binom{n - 2^r}{m - 2^r} \bmod 2.$$

If $n - 2^r < m < 2^r$, then $m_i > n_i$ for at least one $i \le r - 1$. In this case $\binom{n_i}{m_i} = 0$
and $\binom{n}{m} \equiv \dots \cdot 0 \cdot \dots = 0 \bmod 2$, and so $\binom{n}{m}$ is even.

2.5 Prime factorizations. Let us begin with the following simple, but
beautiful, result.

Theorem 2.3. Let $n = (n_r\dots n_1n_0)_p$. Then the number of factors $p$ in the
prime factorization of $n!$ is

$$\frac{n - (n_r + \dots + n_1 + n_0)}{p - 1}.$$

Remark 2.6. The fact that the last fraction is a whole number is true whether
$p$ is prime or not, and is well known if $p = 10$: any positive integer has the same
remainder modulo 9 as the sum of its digits. It can be proved for an arbitrary $p$
precisely as it is proved for $p = 10$ in elementary textbooks.

Proof. Induction on $n$. Denote by $C_p(n)$ the number of factors $p$ in the prime
factorization of $n$. If $C_p(n) = k$, then $n_{k-1} = \dots = n_0 = 0$, $n_k \ne 0$ and $n - 1 =
(n_r\dots n_{k+1}(n_k - 1)(p - 1)(p - 1)\dots(p - 1))_p$. According to the induction hypothesis,

$$C_p((n-1)!) = \frac{(n - 1) - (n_r + \dots + n_{k+1} + n_k - 1 + (p - 1)k)}{p - 1} = \frac{n - (n_r + \dots + n_k)}{p - 1} - k,$$

and hence

$$C_p(n!) = C_p((n-1)!) + C_p(n) = \frac{n - (n_r + \dots + n_k)}{p - 1}.$$

□


This Theorem provides a very efficient way of counting the number of prime
factors in a binomial coefficient. For example,

$$\binom{31241}{17101} = \frac{31241!}{17101! \cdot 14140!},$$

$$\begin{array}{ll}
31241 = (1120212002)_3, & C_3(31241!) = \dfrac{31241 - 11}{2} = 15615, \\
17101 = (212110101)_3, & C_3(17101!) = \dfrac{17101 - 9}{2} = 8546, \\
14140 = (201101201)_3, & C_3(14140!) = \dfrac{14140 - 8}{2} = 7066,
\end{array}$$

$$C_3\binom{31241}{17101} = C_3(31241!) - C_3(17101!) - C_3(14140!) = 15615 - 8546 - 7066 = 3.$$

We established in the previous section that $\binom{31241}{17101}$ is divisible by 3; now we see
that the number of factors 3 in the prime factorization of this number is 3, that is,
it is divisible by 27, but not divisible by 81.
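The digit-sum formula of Theorem 2.3 can be compared with the classical count $\lfloor n/p \rfloor + \lfloor n/p^2 \rfloor + \dots$ of factors $p$ in $n!$. The sketch below does so for the numbers of the example; all function names are hypothetical.

```python
def digit_sum(n, p):
    """Sum of the base-p digits of n."""
    total = 0
    while n:
        n, d = divmod(n, p)
        total += d
    return total


def factorial_valuation(n, p):
    """Number of factors p in n!, by the digit-sum formula of Theorem 2.3."""
    return (n - digit_sum(n, p)) // (p - 1)


def factorial_valuation_direct(n, p):
    """Same quantity computed as floor(n/p) + floor(n/p^2) + ..."""
    total, power = 0, p
    while power <= n:
        total += n // power
        power *= p
    return total


n, m = 31241, 17101
factors_of_3 = (factorial_valuation(n, 3)
                - factorial_valuation(m, 3)
                - factorial_valuation(n - m, 3))
print(factorial_valuation(n, 3), factorial_valuation(m, 3),
      factorial_valuation(n - m, 3), factors_of_3)
```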

Our exposition would not have been complete if we had skipped a beautiful
way to count the number of given factors in the prime factorizations of binomial
coefficients due to one of the best number theorists of the 19th century.

Theorem 2.4 (Kummer, 1852). $C_p\binom{n}{m}$ equals the number of carry-overs in
the addition $m + (n - m) = n$ in the numerical system with base $p$.

For example, $17101 = (212110101)_3$, $14140 = (201101201)_3$. Perform the
addition:

      *   *       *
      2 1 2 1 1 0 1 0 1
    + 2 0 1 1 0 1 2 0 1
    -------------------
    1 1 2 0 2 1 2 0 0 2

There are 3 carry-overs (marked by asterisks), and the prime factorization of
$\binom{31241}{17101}$ contains 3 factors 3.

We leave to the reader the pleasant work of deducing Kummer's Theorem from
the previous results of this section (see Exercise 2.5).
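Counting carry-overs is straightforward to automate. The sketch below counts them for the addition in the example and compares the result with the exponent of 3 found by direct division; `carries` and `valuation` are hypothetical helper names.

```python
from math import comb


def carries(a, b, p):
    """Number of carry-overs when a and b are added in base p."""
    count = carry = 0
    while a or b or carry:
        digit_total = a % p + b % p + carry
        carry = 1 if digit_total >= p else 0
        count += carry
        a //= p
        b //= p
    return count


def valuation(x, p):
    """Exponent of p in the prime factorization of the positive integer x."""
    count = 0
    while x % p == 0:
        x //= p
        count += 1
    return count


print(carries(17101, 14140, 3), valuation(comb(31241, 17101), 3))
```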

2.6 Congruences modulo $p^3$ in the Pascal Triangle. It is much easier
to formulate the results of Sections 2.6 and 2.7 than to prove them. Accordingly,
we will give the statements of more or less all known results and almost no proofs.
The reader may want to reconstruct some of the proofs (although they are not
elementary) and to think about further results in this direction.

Lucas' Theorem (Section 2.4) implies that

$$\binom{pn}{pm} \equiv \binom{n}{m}\binom{0}{0} = \binom{n}{m} \bmod p.$$

But experiments show that, actually, there are "better" congruences. For example,
$\binom{3 \cdot 5}{3 \cdot 2} - \binom{5}{2}$ should be divisible by 3; but, actually,

$$\binom{3 \cdot 5}{3 \cdot 2} - \binom{5}{2} = \binom{15}{6} - \binom{5}{2} = 5005 - 10 = 4995 = 185 \cdot 3^3.$$

Another example:

$$\binom{5 \cdot 3}{5 \cdot 1} - \binom{3}{1} = 3003 - 3 = 3000 = 24 \cdot 5^3.$$

And there is a theorem that states precisely what we see!

Theorem 2.5 (Jacobsthal, 1952, [11]). If $p \ge 5$, then

$$\binom{pn}{pm} - \binom{n}{m}$$

is divisible by $p^3$.

(This is also true for p = 2 and 3, but with some "exceptions". Indeed, 

6*) ~ (3) = 3003 ~ 35 = 2968 = 371 ' 23 and (^3) ~ (l) = 216 = 8 ' 33 - 
But r J - ( 3 J = 15 - 3 = 12 = 3 • 2 2 and r J - ( 2 J = 20 - 2 = 18 = 2 • 3 2 . For 



further results see Exercises 2.6, 2.7.) 
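
A brute-force experiment illustrates the statement (it proves nothing, of course). This is a sketch under our own naming; `valuation` and `jacobsthal_difference` are hypothetical helpers, and the ranges of $n$, $m$, $p$ are arbitrary small choices.

```python
from math import comb


def valuation(x: int, p: int) -> int:
    """Largest e such that p**e divides the non-zero integer x."""
    e = 0
    while x % p == 0:
        x //= p
        e += 1
    return e


def jacobsthal_difference(p: int, n: int, m: int) -> int:
    """The difference C(pn, pm) - C(n, m)."""
    return comb(p * n, p * m) - comb(n, m)


# The two numerical examples preceding the theorem.
print(jacobsthal_difference(3, 5, 2), jacobsthal_difference(5, 3, 1))

# Divisibility by p**3 for a few primes p >= 5 and small n, m.
for p in (5, 7, 11):
    assert all(jacobsthal_difference(p, n, m) % p**3 == 0
               for n in range(1, 12) for m in range(n + 1))

# Two "exceptions" for p = 2 and p = 3: only the second power divides.
print(valuation(jacobsthal_difference(2, 3, 1), 2),
      valuation(jacobsthal_difference(3, 2, 1), 3))
```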
We shall not prove Jacobsthal's theorem here; we shall restrict ourselves to a
more modest result.

PROPOSITION 2.7. For any prime $p$ and any $m$ and $n$,

$$\binom{pn}{pm} - \binom{n}{m}$$

is divisible by $p^2$.

Proof. We shall use the following fact following from Lucas' Theorem: if $n$ is
divisible by $p$ and $m$ is not divisible by $p$, then $\binom{n}{m}$ is divisible by $p$.
(Indeed, if $n = pr$ and $m = ps + t$, $0 < t < p$, then
$\binom{n}{m} \equiv \binom{r}{s}\binom{0}{t} = 0 \bmod p$.)

Now, we use induction on $n$ (for $n = 1$, we have nothing to prove). Assume
that the statement with $n - 1$ in place of $n$ is true. Consider the equality

$$(a + b)^{pn} = (a + b)^{p(n-1)} \cdot (a + b)^p.$$

Equating the coefficients of $a^{pm} b^{p(n-m)}$, we get the following:

$$\binom{pn}{pm} = \binom{p(n-1)}{pm}\binom{p}{0} + \binom{p(n-1)}{pm-1}\binom{p}{1} + \dots
+ \binom{p(n-1)}{pm-p+1}\binom{p}{p-1} + \binom{p(n-1)}{pm-p}\binom{p}{p}.$$

On the right hand side of the last equality every summand, with the exception
of the two extreme ones, is a product of two numbers divisible by $p$; hence, each of
these summands is divisible by $p^2$ and

$$\binom{pn}{pm} \equiv \binom{p(n-1)}{pm} + \binom{p(n-1)}{p(m-1)} \bmod p^2.$$

By the induction hypothesis,

$$\binom{p(n-1)}{pm} + \binom{p(n-1)}{p(m-1)} \equiv \binom{n-1}{m} + \binom{n-1}{m-1} = \binom{n}{m} \bmod p^2,$$

whence our result. □

Is it possible to enhance Jacobsthal's result? In some special cases it is
possible (see next section). In general, it is unlikely. Let us mention the following
(unpublished) result.

Theorem 2.6 (G. Kuperberg, 1999). If

$$\binom{2p}{p} \equiv \binom{2}{1} \bmod p^4,$$

then

$$\binom{pn}{pm} \equiv \binom{n}{m} \bmod p^4$$

for every $m, n$.

However, this property ($\binom{2p}{p} \equiv \binom{2}{1} \bmod p^4$) does not hold too often.
According to G. Kuperberg (who has a heuristic "proof" that it is true for infinitely
many primes $p$), the smallest prime number for which it holds is 16,843.

Still, for some special $m$ and $n$, congruence modulo a much higher power of $p$
may hold. We shall consider some results of this kind in the next (last) section.

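2.7 Congruences modulo higher powers of $p$. Let us consider numbers
of the form

$$\binom{2^n}{2^{n-1}}.$$

None of these numbers is divisible even by 4. But let us examine their successive
differences:

$$\begin{aligned}
\binom{4}{2} - \binom{2}{1} &= 6 - 2 = 4 = 2^2,\\
\binom{8}{4} - \binom{4}{2} &= 70 - 6 = 64 = 2^6,\\
\binom{16}{8} - \binom{8}{4} &= 12870 - 70 = 12800 = 25\cdot 2^9,\\
\binom{32}{16} - \binom{16}{8} &= 601080390 - 12870 = 601067520 = 146745\cdot 2^{12}.
\end{aligned}$$

Consider similar differences for bigger primes:

$$\begin{aligned}
\binom{9}{3} - \binom{3}{1} &= 84 - 3 = 81 = 3^4,\\
\binom{27}{9} - \binom{9}{3} &= 4686825 - 84 = 4686741 = 2143\cdot 3^7,\\
\binom{25}{5} - \binom{5}{1} &= 53130 - 5 = 53125 = 17\cdot 5^5,\\
\binom{49}{7} - \binom{7}{1} &= 85900584 - 7 = 85900577 = 5111\cdot 7^5.
\end{aligned}$$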
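
These differences are easy to recompute. The sketch below (our own helper names, nothing here comes from the original text) prints each difference as a product of a power of $p$ and a cofactor not divisible by $p$.

```python
from math import comb


def valuation(x: int, p: int) -> int:
    """Largest e such that p**e divides the non-zero integer x."""
    e = 0
    while x % p == 0:
        x //= p
        e += 1
    return e


def tower_difference(p: int, k: int) -> int:
    """C(p**(k+1), p**k) - C(p**k, p**(k-1)) for k >= 1."""
    return comb(p ** (k + 1), p ** k) - comb(p ** k, p ** (k - 1))


for p, k in [(2, 1), (2, 2), (2, 3), (2, 4), (3, 1), (3, 2), (5, 1), (7, 1)]:
    d = tower_difference(p, k)
    e = valuation(d, p)
    print(f"p={p}, k={k}: {d} = {d // p**e} * {p}^{e}")
```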



Let us try to explain these results.

Theorem 2.7 (A. Schwarz, 1959). If $p \ge 5$, then

$$\binom{p^2}{p} \equiv \binom{p}{1} \bmod p^5.$$

Remark 2.8. (1) This result was never published. A. Schwarz, who is now 
a prominent topologist and mathematical physicist, does not himself remember 
proving this theorem. However, one of the authors of this book (DF) was a witness 
to the event. 






(2) We do not know whether the congruence holds modulo $p^6$ for any prime $p$. We
realize that modern software can, possibly, resolve this problem in a split second.

To prove Schwarz's Theorem, we shall use the following extended notion of a
congruence. We shall say that a rational number $r = \frac{m}{n}$ (where the fraction is
assumed irreducible) is divisible by $p^k$ if $m$ is divisible by $p^k$ and $n$ is not divisible
by $p$. For rational numbers $r, s$, the congruence $r \equiv s \bmod p^k$ means that $r - s$ is
divisible by $p^k$. (For example, $\frac{1}{5} \equiv 2 \bmod 3^2$.) These congruences possess the usual
properties of congruences: if $r \equiv s \bmod p^k$ and $s \equiv t \bmod p^k$, then $r \equiv t \bmod p^k$; if
$r \equiv s \bmod p^k$ and the denominator of $t$ is not divisible by $p$, then $rt \equiv st \bmod p^k$;
etc.

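Lemma 2.9. For a prime $p \ge 5$,

$$1 + \frac{1}{2} + \dots + \frac{1}{p-1}$$

is divisible by $p^2$.

Proof of Lemma. Let $p = 2q + 1$ (since $p \ge 5$, $p$ is odd). Then

$$1 + \frac{1}{2} + \dots + \frac{1}{p-1}
= \left(1 + \frac{1}{p-1}\right) + \left(\frac{1}{2} + \frac{1}{p-2}\right) + \dots + \left(\frac{1}{q} + \frac{1}{p-q}\right)
= p\left[\frac{1}{p-1} + \frac{1}{2(p-2)} + \dots + \frac{1}{q(p-q)}\right],$$

and all we need to prove is that

$$\frac{1}{p-1} + \frac{1}{2(p-2)} + \dots + \frac{1}{q(p-q)}$$

is divisible by $p$.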

Sublemma 2.9.1. For every $i = 1, \dots, p-1$ there exists a unique $s_i$, $1 \le s_i \le p-1$,
such that $i s_i \equiv 1 \bmod p$. Moreover,

(a) $s_{p-i} = p - s_i$;

(b) the numbers $s_1, s_2, \dots, s_{p-1}$ form a permutation of the numbers $1, 2, \dots, p-1$.

Proof of Sublemma. For a given $i$, consider the numbers $i, 2i, \dots, (p-1)i$. None
of these numbers is divisible by $p$, and no two are congruent modulo $p$ (indeed, if
$ji \equiv ki \bmod p$, then $ji - ki = (j - k)i$ is divisible by $p$, which is impossible, since
neither $i$ nor $j - k$ is divisible by $p$). Hence, the numbers $i, 2i, \dots, (p-1)i$ have
different remainders mod $p$, and since there are precisely $p - 1$ possible remainders,
each remainder appears exactly once. In particular, there exists a unique $j$ such that
$ji \equiv 1 \bmod p$; this $j$ is our $s_i$. Statements (a) and (b) are obvious:
$(p-i)(p-s_i) = p^2 - p(i + s_i) + i s_i \equiv 1 \bmod p$, and since the numbers
$s_1, s_2, \dots, s_{p-1}$ are all different, they form a permutation of $1, 2, \dots, p-1$. □

Example: if $p = 11$, then $s_1 = 1$, $s_2 = 6$, $s_3 = 4$, $s_4 = 3$, $s_5 = 9$, $s_6 = 2$, $s_7 = 8$,
$s_8 = 7$, $s_9 = 5$, $s_{10} = 10$.
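
The numbers $s_i$ are modular inverses, so they can be tabulated directly. A minimal sketch (the function name `inverses` is ours; it relies on Python's built-in `pow(i, -1, p)`, available from Python 3.8):

```python
def inverses(p: int) -> dict:
    """Map each i in 1..p-1 to the unique s_i in 1..p-1 with i * s_i = 1 (mod p)."""
    return {i: pow(i, -1, p) for i in range(1, p)}


s = inverses(11)
print([s[i] for i in range(1, 11)])

# (a) s_{p-i} = p - s_i, and (b) the s_i form a permutation of 1..p-1.
for p in (5, 7, 11, 13, 101):
    s = inverses(p)
    assert all(s[p - i] == p - s[i] for i in range(1, p))
    assert sorted(s.values()) == list(range(1, p))
```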
Back to the Lemma. Since $\frac{1}{i} \equiv s_i \bmod p$ (indeed, $s_i - \frac{1}{i} = \frac{i s_i - 1}{i}$, which is
divisible by $p$),

$$\frac{1}{p-1} + \frac{1}{2(p-2)} + \dots + \frac{1}{q(p-q)} \equiv s_1 s_{p-1} + s_2 s_{p-2} + \dots + s_q s_{p-q} \bmod p$$

(where, as before, $p = 2q + 1$). Of the two numbers $s_i$, $s_{p-i} = p - s_i$, precisely one
is less than $\frac{p}{2}$. Hence, the numbers $s_1 s_{p-1}, s_2 s_{p-2}, \dots, s_q s_{p-q}$ form a permutation
of the numbers $1(p-1), 2(p-2), \dots, q(p-q)$, and

$$\begin{aligned}
\frac{1}{p-1} + \frac{1}{2(p-2)} + \dots + \frac{1}{q(p-q)}
&\equiv 1(p-1) + 2(p-2) + \dots + q(p-q)\\
&= p(1 + 2 + \dots + q) - (1^2 + 2^2 + \dots + q^2)\\
&= \frac{pq(q+1)}{2} - \frac{q(q+1)(2q+1)}{6} = \frac{pq(q+1)}{3} \equiv 0 \bmod p,
\end{aligned}$$

as needed. □
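
Lemma 2.9 can be observed with exact rational arithmetic. The following sketch uses Python's `fractions.Fraction` (which always keeps fractions irreducible); the helper name `harmonic` is ours.

```python
from fractions import Fraction


def harmonic(p: int) -> Fraction:
    """1 + 1/2 + ... + 1/(p-1) as an exact, irreducible fraction."""
    return sum(Fraction(1, i) for i in range(1, p))


for p in (3, 5, 7, 11, 13, 17, 19):
    h = harmonic(p)
    # Divisibility by p**2 in the extended sense: the numerator is divisible by p**2.
    print(p, h, h.numerator % p**2 == 0)
```

The line for $p = 3$ shows that the restriction $p \ge 5$ cannot be dropped: the sum is $\frac{3}{2}$, whose numerator is divisible by 3 but not by 9.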



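Proof of Schwarz's Theorem.

$$\begin{aligned}
\binom{p^2}{p} - \binom{p}{1} &= \frac{p^2(p^2-1)\cdots(p^2-(p-1))}{1\cdot 2\cdots(p-1)\,p} - p\\
&= \frac{p}{(p-1)!}\left[(1-p^2)(2-p^2)\cdots((p-1)-p^2) - 1\cdot 2\cdots(p-1)\right],
\end{aligned}$$

and all we need to prove is that

$$(1-p^2)(2-p^2)\cdots((p-1)-p^2) - 1\cdot 2\cdots(p-1) \equiv 0 \bmod p^4.$$

But

$$(1-p^2)(2-p^2)\cdots((p-1)-p^2) = 1\cdot 2\cdots(p-1)
- p^2\left(1 + \frac{1}{2} + \dots + \frac{1}{p-1}\right)(p-1)!
+ \text{terms divisible by } p^4,$$

whence

$$(1-p^2)(2-p^2)\cdots((p-1)-p^2) - 1\cdot 2\cdots(p-1) \equiv
-p^2\left(1 + \frac{1}{2} + \dots + \frac{1}{p-1}\right)(p-1)! \bmod p^4,$$

which is divisible by $p^4$ by Lemma 2.9. □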

Many of the congruences considered above are contained in the following (also
unpublished) result.

Theorem 2.8 (M. Zieve, 2000). If $p \ge 5$ then, for any positive integers $k, m, n$,

$$\binom{np^k}{mp^k} \equiv \binom{np^{k-1}}{mp^{k-1}} \bmod p^{3k}.$$

We shall not prove this theorem here, but will restrict ourselves (as we did in
the case of Jacobsthal's Theorem) to a more modest result.
PROPOSITION 2.10. If $k \ge 2$ then

$$\binom{2^{k+1}}{2^k} - \binom{2^k}{2^{k-1}}$$

is divisible by $2^{2k+2}$.

Proof. We shall use the following fact: if $0 < m < 2^k$ and $m$ is odd, then
$\binom{2^k}{m}$ is divisible by $2^k$. (The following, more general, result follows from
Kummer's Theorem, Section 2.5: if $n$ is divisible by $p^k$ and $m$ is not divisible by $p$,
then $\binom{n}{m}$ is divisible by $p^k$.)

The difference $\binom{2^{k+1}}{2^k} - \binom{2^k}{2^{k-1}}$ in question is the coefficient of
$x^{2^k}$ in the polynomial

$$\begin{aligned}
(1+x)^{2^{k+1}} - (1-x^2)^{2^k} &= (1+x)^{2^k}\left[(1+x)^{2^k} - (1-x)^{2^k}\right]\\
&= \left[1 + \binom{2^k}{1}x + \binom{2^k}{2}x^2 + \dots + x^{2^k}\right]
\cdot 2\left[\binom{2^k}{1}x + \binom{2^k}{3}x^3 + \dots + \binom{2^k}{2^k-1}x^{2^k-1}\right].
\end{aligned}$$

Since the second polynomial in the last expression contains only odd degrees, the
coefficient of $x^{2^k}$ in the product is

$$2\left[\binom{2^k}{2^k-1}\binom{2^k}{1} + \binom{2^k}{2^k-3}\binom{2^k}{3} + \dots + \binom{2^k}{1}\binom{2^k}{2^k-1}\right].$$

Every binomial coefficient in the last expression is divisible by $2^k$ by the remark
at the beginning of the proof; hence every summand in the last sum is divisible by
$2^{2k}$. Also, every summand in this sum is repeated twice, and also there is a factor
2 before the sum. Thus the whole expression is divisible by $2^{2k+2}$. □
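
A quick numerical look at Proposition 2.10, as a sketch with our own helper name `two_adic_valuation`; the range of $k$ is an arbitrary choice. The row $k = 1$ is included on purpose: there the exponent of 2 is smaller than $2k + 2$, so the assumption $k \ge 2$ is needed.

```python
from math import comb


def two_adic_valuation(x: int) -> int:
    """Exponent of 2 in the positive integer x (position of the lowest set bit)."""
    return (x & -x).bit_length() - 1


for k in range(1, 8):
    diff = comb(2 ** (k + 1), 2 ** k) - comb(2 ** k, 2 ** (k - 1))
    # Columns: k, actual exponent of 2 in the difference, guaranteed exponent 2k + 2.
    print(k, two_adic_valuation(diff), 2 * k + 2)
```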

Let us mention, in conclusion, A. Granville's dynamic on-line survey of arith- 
metical properties of binomial coefficients [37] . 



2.8 Exercises.

2.1. Prove that

$$\tan(n\theta) = \frac{\binom{n}{1}\tan\theta - \binom{n}{3}\tan^3\theta + \binom{n}{5}\tan^5\theta - \dots}
{1 - \binom{n}{2}\tan^2\theta + \binom{n}{4}\tan^4\theta - \binom{n}{6}\tan^6\theta + \dots}.$$

2.2. Prove that for the hyperbolic functions

$$\sinh(x) = \frac{e^x - e^{-x}}{2}, \quad \cosh(x) = \frac{e^x + e^{-x}}{2}, \quad \tanh(x) = \frac{\sinh(x)}{\cosh(x)}$$

formulas hold similar to those in Section 2.3 with all the minuses replaced by pluses.

2.3. How many of the numbers $\binom{n}{m}$ with $0 \le m < 2^{100}$, $0 \le n < 2^{100}$
are odd?

2.4. How many of the numbers $\binom{n}{m}$ with $0 \le m < 2^{100}$, $0 \le n < 2^{100}$
are not divisible by 4?

2.5. Prove the Kummer Theorem 2.4 (deduce it from Theorem 2.3). 

2.6. (a) Prove that $\binom{2n}{2m} \not\equiv \binom{n}{m} \bmod 2^3$ for infinitely many pairs $(m, n)$.
Namely, prove that $\binom{2n}{2} - \binom{n}{1} \equiv 0 \bmod 2^3$ if and only if $\binom{n}{2}$ is even, that is, if
$n \equiv 0$ or $1 \bmod 4$.

(b) Prove also that $\binom{2n}{4} - \binom{n}{2} \equiv 0 \bmod 2^3$ if and only if $n \not\equiv 3 \bmod 4$.

The reader is encouraged to consider the differences
$\binom{2n}{6} - \binom{n}{3}$, $\binom{2n}{8} - \binom{n}{4}$,
and so on.



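2.7. (a) Prove that $\binom{3n}{3} \not\equiv \binom{n}{1} \bmod 3^3$ if and only if $n \equiv 2 \bmod 3$.

(b) Prove that $\binom{3n}{6} - \binom{n}{2} \equiv 0 \bmod 3^3$ for all $n$.

We do not know whether

$$\binom{3n}{3m} \equiv \binom{n}{m} \bmod 3^3$$

for all $m, n$ with $2 < m < n$.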



2.8. (a) Prove that for $|x| < \frac{1}{4}$ the series $\sum_{n=0}^{\infty} \binom{2n}{n} x^n$ converges to $\frac{1}{\sqrt{1 - 4x}}$.

(b) Deduce from this (or prove directly) that for any $n$

$$1\cdot\binom{2n}{n} + \binom{2}{1}\binom{2(n-1)}{n-1} + \binom{4}{2}\binom{2(n-2)}{n-2} + \dots + \binom{2n}{n}\cdot 1 = 4^n.$$
2.9. Let

$$C_n = \frac{(2n)!}{n!\,(n+1)!} = \frac{\binom{2n}{n}}{n+1};$$

these numbers are called the Catalan numbers.

(a) Prove that all the Catalan numbers are integers (the first 5 Catalan numbers
are 1, 2, 5, 14, 42).

Let $C(x) = \sum_{n=0}^{\infty} C_n x^n$.

(b) Prove that

$$(xC(x))' = \sum_{n=0}^{\infty} \binom{2n}{n} x^n.$$

(c) Deduce from (b) and Exercise 2.8(a) that (for $0 < |x| < \frac{1}{4}$)

$$C(x) = \frac{1 - \sqrt{1 - 4x}}{2x}.$$

(d) It follows from (c) that $xC(x)^2 - C(x) + 1 = 0$. Deduce that for any $n \ge 1$

$$C_n = \sum_{p+q=n-1} C_p C_q.$$

The remaining parts of this exercise have solutions based on the last formula.

(e) Let $*$ be a non-associative multiplication operation. Then the expression
$a * b * c$ may mean $(a * b) * c$ or $a * (b * c)$. Similarly, $a * b * c * d$ may have 5 different
meanings: $((a * b) * c) * d$, $(a * b) * (c * d)$, $(a * (b * c)) * d$, $a * ((b * c) * d)$, $a * (b * (c * d))$.
Prove that the number of meanings which the expression $a_1 * \dots * a_{n+1}$ may have,
depending on the order of multiplication, is $C_n$.

(f) Let $P$ be a convex $n$-gon. A triangulation of $P$ is a partition of it into $n - 2$
triangles whose vertices are those of $P$. For example, a convex quadrilateral $ABCD$
has 2 triangulations: $ABC \cup ACD$ and $ABD \cup BCD$. A convex pentagon has 5
triangulations (draw them!). Prove that the number of triangulations of a convex
$n$-gon is $C_{n-2}$.

See exercise 6.19 in R. Stanley's book [73] for 66 different combinatorial
interpretations of the Catalan numbers; see also an on-line addendum [74] for many
more.
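
Before attempting Exercises 2.8 and 2.9, the identities can be checked on a computer. This is only an experiment, not a solution; the function names are ours, and the recursion starts from $C_0 = 1$, the value given by the defining formula.

```python
from math import comb


def catalan_closed(n: int) -> int:
    """C_n = C(2n, n) / (n + 1); the division is exact."""
    return comb(2 * n, n) // (n + 1)


def catalan_recursive(count: int) -> list:
    """C_0, ..., C_{count-1} from C_0 = 1 and C_n = sum of C_p * C_q over p + q = n - 1."""
    c = [1]
    for n in range(1, count):
        c.append(sum(c[p] * c[n - 1 - p] for p in range(n)))
    return c[:count]


print(catalan_recursive(8))
assert catalan_recursive(20) == [catalan_closed(n) for n in range(20)]

# Exercise 2.8(b): the convolution of central binomial coefficients equals 4**n.
assert all(sum(comb(2 * i, i) * comb(2 * (n - i), n - i) for i in range(n + 1)) == 4 ** n
           for n in range(30))
```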




LECTURE 3 

On Collecting Like Terms, on Euler, Gauss and 
MacDonald, and on Missed Opportunities 

3.1 The Euler identity. In the middle of the 18th century, Leonhard Euler
became interested in the coefficients of the polynomial

$$\varphi_n(x) = (1 - x)(1 - x^2)(1 - x^3) \dots (1 - x^n).$$

He got rid of parentheses and obtained the following amazing result:





$$\begin{array}{rcl}
\varphi_1(x) &=& 1 - x\\
\varphi_2(x) &=& 1 - x - x^2 + x^3\\
\varphi_3(x) &=& 1 - x - x^2 + x^4 + x^5 - x^6\\
\varphi_4(x) &=& 1 - x - x^2 + 2x^5 - x^8 - x^9 + x^{10}\\
\varphi_5(x) &=& 1 - x - x^2 + x^5 + x^6 + x^7 - x^8 - x^9 - x^{10} \dots\\
\varphi_6(x) &=& 1 - x - x^2 + x^5 + 2x^7 - x^9 - x^{10} \dots\\
\varphi_7(x) &=& 1 - x - x^2 + x^5 + x^7 + x^8 - x^{10} \dots\\
\varphi_8(x) &=& 1 - x - x^2 + x^5 + x^7 + x^9 \dots\\
\varphi_9(x) &=& 1 - x - x^2 + x^5 + x^7 + x^{10} \dots\\
\varphi_{10}(x) &=& 1 - x - x^2 + x^5 + x^7 \dots
\end{array}$$

The dots mean the terms of the polynomials which have degrees > 10 (we have no
room for them all: for example, the polynomial $\varphi_{10}(x)$ has degree 55).

Following Euler, let us make some observations. First (not surprisingly), the
coefficients of every $x^m$ become stable when $n$ grows; more precisely,
$\varphi_{m+1}(x), \varphi_{m+2}(x), \varphi_{m+3}(x), \dots$ all have the same coefficient of $x^m$.
(It is obvious: $\varphi_{m+1}(x) = \varphi_m(x)(1 - x^{m+1})$, $\varphi_{m+2}(x) = \varphi_{m+1}(x)(1 - x^{m+2}), \dots$;
hence multiplication by $1 - x^n$ with $n > m$ does not affect the coefficient of $x^m$.)
Because of this, we can speak of the "stable" product

$$\varphi(x) = \varphi_\infty(x) = \prod_{n=1}^{\infty} (1 - x^n);$$

it is not a polynomial any more, it is an infinite series containing arbitrarily high
powers of $x$. We will sometimes call $\varphi(x)$ the Euler function.

The second observation (more surprising) is that when we collect terms in the
product $(1 - x)(1 - x^2) \dots (1 - x^n)$, many terms cancel. For example, when we
multiply $(1 - x)(1 - x^2) \dots (1 - x^{10})$, there will be 43 terms with $x$ to the powers
0 through 10, and only 5 of them ($1, -x, -x^2, x^5, x^7$) survive the cancellations. This
phenomenon becomes even more visible when we make further computations; here
is, for example, the part of the series $\varphi(x)$ containing all the terms with $x$ to the
power $\le 100$:

$$\varphi(x) = 1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + x^{22} + x^{26} - x^{35} - x^{40}
+ x^{51} + x^{57} - x^{70} - x^{77} + x^{92} + x^{100} + \dots$$


Euler, who was extremely good with long computations, probably calculated almost 
this many terms. And after this he simply could not help noticing that all the non- 
zero coefficients of this series are ones and negative ones and that they go in a 
strictly predetermined order: two ones, two negative ones, two ones, two negative 
ones, and so on. If you look at the table below, you can guess (as Euler did) the 
powers of x with non-zero coefficients: 



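$$\begin{array}{l|c|c|c|c|c|c|c|c|c}
\text{exponents} & 0 & 1, 2 & 5, 7 & 12, 15 & 22, 26 & 35, 40 & 51, 57 & 70, 77 & 92, 100\\
\hline
\text{coefficients} & 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 & 1
\end{array}$$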



This table suggests that the term $x^{\frac{3n^2 \pm n}{2}}$ ($n \ge 0$) appears with the coefficient
$(-1)^n$, and there are no other non-zero terms. This conjecture may be stated in
the form

$$(1 - x)(1 - x^2)(1 - x^3) \dots = 1 - x - x^2 + x^5 + x^7 + \dots
+ (-1)^n x^{\frac{3n^2 - n}{2}} + (-1)^n x^{\frac{3n^2 + n}{2}} + \dots,$$

or, shorter,

$$\prod_{n=1}^{\infty} (1 - x^n) = 1 + \sum_{r=1}^{\infty} (-1)^r \left(x^{\frac{3r^2 - r}{2}} + x^{\frac{3r^2 + r}{2}}\right),$$

or, still shorter,

$$\prod_{n=1}^{\infty} (1 - x^n) = \sum_{r=-\infty}^{\infty} (-1)^r x^{\frac{3r^2 + r}{2}}.$$

By the way, the numbers $\frac{3n^2 \pm n}{2}$ arising in this formula are known as "pentagonal
numbers" (or "Euler pentagonal numbers"). The reason for this name is
clear from Figure 3.1 (the black-dotted pentagons have the same number of dots
along each side).
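
Euler's long computation takes a fraction of a second today. The sketch below (function names are ours) expands the product up to a given degree, compares the result with the conjectured series, and counts the terms that appear before like terms are collected.

```python
def euler_product(max_degree: int) -> list:
    """Coefficients of (1 - x)(1 - x^2)(1 - x^3)... up to x^max_degree."""
    coeffs = [1] + [0] * max_degree
    for n in range(1, max_degree + 1):
        # Multiply by (1 - x^n) in place; go downwards so each old value is used once.
        for d in range(max_degree, n - 1, -1):
            coeffs[d] -= coeffs[d - n]
    return coeffs


def pentagonal_series(max_degree: int) -> list:
    """Conjectured coefficients: (-1)^r at the exponents (3r^2 - r)/2 and (3r^2 + r)/2."""
    coeffs = [1] + [0] * max_degree
    r = 1
    while (3 * r * r - r) // 2 <= max_degree:
        for e in ((3 * r * r - r) // 2, (3 * r * r + r) // 2):
            if e <= max_degree:
                coeffs[e] = (-1) ** r
        r += 1
    return coeffs


def terms_before_cancellation(max_degree: int) -> int:
    """Number of monomials of degree <= max_degree obtained before collecting like terms."""
    counts = [1] + [0] * max_degree
    for n in range(1, max_degree + 1):
        for d in range(max_degree, n - 1, -1):
            counts[d] += counts[d - n]
    return sum(counts)


phi = euler_product(100)
print([d for d, c in enumerate(phi) if c != 0])   # exponents with non-zero coefficients
assert phi == pentagonal_series(100)
print(terms_before_cancellation(10))
```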

It is quite interesting that although the proof of Euler's identity looks short
and elementary (see Section 3.3), Euler, who did so many immensely harder things
in mathematics, experienced difficulties with the proof. His "memoir" dedicated
to this subject and published in 1751 under the title "Discovery of a most
extraordinary law of the numbers concerning sums of their divisors" (the reader should
wait until Section 3.5 for an explanation of this title) did not contain any proof of
the identity. A relevant extract from the memoir (taken from the book of G. Polya
[62]) is presented below.

[Figure 3.1. Pentagonal numbers: 1, 2; 5, 7; 12, 15; 22, 26; 35, 40; 51, 57; 70, 77; 92, 100; 117, 126.]

3.2 What Euler wrote about his identity. "In considering the partitions
of numbers, I examined, a long time ago, the expression

$$(1 - x)(1 - x^2)(1 - x^3)(1 - x^4)(1 - x^5)(1 - x^6)(1 - x^7)(1 - x^8) \dots,$$

in which the product is assumed to be infinite. In order to see what kind of series
will result, I multiplied actually a great number of factors and found

$$1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + x^{22} + x^{26} - x^{35} - x^{40} + \dots$$

The exponents of $x$ are the same which enter into the above formula;$^1$ also the
signs $+$ and $-$ arise twice in succession. It suffices to undertake this multiplication
and to continue it as far as it is deemed proper to become convinced of the truth
of these series. Yet I have no other evidence for this, except a long induction which
I have carried out so far that I cannot in any way doubt the law governing the
formation of these terms and their exponents. I have long searched in vain for a
rigorous demonstration of the equation between the series and the above infinite
product $(1 - x)(1 - x^2)(1 - x^3) \dots$, and I proposed the same question to some of
my friends with whose ability in these matters I am familiar, but all have agreed
with me on the truth of this transformation of the product into a series, without
being able to unearth any clue of a demonstration."

$^1$ This is a reference to a preceding part of the Memoir containing an explanation of the
sequences 1, 5, 12, 22, 35, ... and 2, 7, 15, 26, 40, ... .

3.3 Proof of the Euler identity. Let us collect terms in the product

$$(1 - x)(1 - x^2)(1 - x^3)(1 - x^4) \dots$$

We shall obtain the (infinite) sum of the terms

$$(-1)^k x^{n_1 + \dots + n_k}, \quad k \ge 0, \quad 0 < n_1 < \dots < n_k.$$

The total coefficient of $x^n$ will be

$$\boxed{\begin{array}{c}\text{The number of partitions}\\ n = n_1 + \dots + n_k\\ (0 < n_1 < \dots < n_k)\\ \text{with even } k\end{array}}
\;-\;
\boxed{\begin{array}{c}\text{The number of partitions}\\ n = n_1 + \dots + n_k\\ (0 < n_1 < \dots < n_k)\\ \text{with odd } k\end{array}}$$

We want to prove that the two numbers in the boxes are usually the same, and in
some exceptional cases differ by 1.

For a partition $n = n_1 + \dots + n_k$, $0 < n_1 < \dots < n_k$, we denote by
$s = s(n_1, \dots, n_k)$ the maximal number of $n_i$'s, counting from $n_k$ to the left, which form
a block of consecutive numbers (that is, of the form $a, a + 1, \dots, a + b$). In other
words, $s$ is the maximal number satisfying the relation $n_{k-s+1} = n_k - s + 1$. (Thus,
$1 \le s \le k$.)

We shall distinguish 3 types of partitions $n = n_1 + \dots + n_k$, $0 < n_1 < \dots < n_k$.

Type 1: $n_1 \le s$, excluding the case $n_1 = s = k$.

Type 2: $n_1 > s$, excluding the case $n_1 = s + 1 = k + 1$.

Type 3: the two excluded cases, $n_1 = s = k$ or $n_1 = s + 1 = k + 1$.

Here is a 1–1 correspondence between partitions of $n$ of Type 1 and partitions
of $n$ of Type 2:

$$(n_1, n_2, \dots, n_{k-1}, n_k) \mapsto (n_2, \dots, n_{k-n_1},\ n_{k-n_1+1} + 1, \dots, n_k + 1).$$

In words: we remove the number $n_1$ from the partition, then split it into $n_1$ ones,
and then add these ones to the $n_1$ last (biggest) terms of the partition (it is important
that if $s = n_1$, then $s < k$; otherwise we shall have to remove $n_1$ and then to add 1
to $n_1$, but it is not there anymore). In formulas:

$$(n_1, \dots, n_k) \mapsto (m_1, \dots, m_{k-1}), \qquad
m_i = \begin{cases} n_{i+1}, & \text{if } i < k - n_1,\\ n_{i+1} + 1, & \text{if } i \ge k - n_1.\end{cases}$$

Examples:

$13 = 1 + 3 + 4 + 5$; $(1, 3, 4, 5) \mapsto (3, 4, 5 + 1) = (3, 4, 6)$

$37 = 2 + 5 + 9 + 10 + 11$; $(2, 5, 9, 10, 11) \mapsto (5, 9, 10 + 1, 11 + 1) = (5, 9, 11, 12)$

The partition $m_1, \dots, m_{k-1}$ belongs to Type 2. Indeed,
$m_1 \ge n_2 > n_1 = s(m_1, \dots, m_{k-1})$, and if $m_1 = s(m_1, \dots, m_{k-1}) + 1 = (k - 1) + 1$,
then, on one hand, $m_1 = n_1 + 1$, and on the other hand, $n_1 + 1 = k$, hence
$m_i = n_{i+1} + 1$ for $i \ge k - n_1 = 1$, hence $m_1 = n_2 + 1$; this is not possible,
since $n_2 > n_1$.

The fact that the above transformation is 1–1 follows from the existence of
an inverse transformation:

$$(m_1, \dots, m_{k-1}) \mapsto (s, m_1, \dots, m_{k-s-1},\ m_{k-s} - 1, \dots, m_{k-1} - 1), \qquad s = s(m_1, \dots, m_{k-1})$$

(that is, we subtract 1 from each of the $s$ consecutive numbers in the right end,
collect these ones into one number $s$ and place this $s$ before $m_1$) or, in formulas:

$$n_i = \begin{cases} s, & \text{if } i = 1,\\ m_{i-1}, & \text{if } 2 \le i \le k - s,\\ m_{i-1} - 1, & \text{if } i > k - s.\end{cases}$$

Examples:

$(3, 4, 6) \mapsto (1, 3, 4, 6 - 1) = (1, 3, 4, 5)$

$(5, 9, 11, 12) \mapsto (2, 5, 9, 11 - 1, 12 - 1) = (2, 5, 9, 10, 11)$

The terms

$$(-1)^k x^{n_1 + \dots + n_k} \quad \text{and} \quad (-1)^{k-1} x^{m_1 + \dots + m_{k-1}}$$

corresponding to each other cancel in the product $(1 - x)(1 - x^2)(1 - x^3) \dots$, and
there remain only terms corresponding to partitions of Type 3. These are

$$(k, k + 1, \dots, 2k - 1) \quad \text{and} \quad (k + 1, k + 2, \dots, 2k),$$

and the corresponding terms in $(1 - x)(1 - x^2)(1 - x^3) \dots$ are

$$(-1)^k x^{k + (k+1) + \dots + (2k-1)} = (-1)^k x^{\frac{3k^2 - k}{2}}$$

and

$$(-1)^k x^{(k+1) + (k+2) + \dots + 2k} = (-1)^k x^{\frac{3k^2 + k}{2}}.$$

□
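
The correspondence used in this proof can be tried out on all partitions of small numbers. The sketch below is our own illustration (all function names are hypothetical): `partner` applies the Type 1 to Type 2 map or its inverse, whichever is applicable, and returns `None` for partitions of Type 3. Partitions are enumerated by brute force, so only small $n$ are practical.

```python
from itertools import combinations


def run_length(parts) -> int:
    """s: length of the maximal block of consecutive integers at the right end."""
    s = 1
    while s < len(parts) and parts[-s - 1] == parts[-s] - 1:
        s += 1
    return s


def partner(parts):
    """Image of an increasing tuple of distinct parts; None for Type 3 partitions."""
    parts = list(parts)
    k, n1, s = len(parts), parts[0], run_length(parts)
    if (n1 == s == k) or (n1 == s + 1 == k + 1):
        return None                                    # Type 3: the excluded cases
    if n1 <= s:                                        # Type 1: spread n1 over the last n1 parts
        rest = parts[1:]
        cut = len(rest) - n1
        return tuple(rest[:cut] + [m + 1 for m in rest[cut:]])
    # Type 2: take 1 from each of the last s parts and put s in front
    return tuple([s] + parts[:k - s] + [m - 1 for m in parts[k - s:]])


def distinct_partitions(n: int) -> list:
    """All partitions of n into distinct positive parts, as increasing tuples."""
    return [c for k in range(1, n + 1)
            for c in combinations(range(1, n + 1), k) if sum(c) == n]


print(partner((1, 3, 4, 5)), partner((2, 5, 9, 10, 11)))    # the examples from the text
for n in range(1, 16):
    unpaired = []
    for q in distinct_partitions(n):
        image = partner(q)
        if image is None:
            unpaired.append(q)
        else:
            assert sum(image) == n and abs(len(image) - len(q)) == 1
            assert partner(image) == q                 # applying the map twice gives q back
    print(n, unpaired)
```

For each $n$ the list of unpaired partitions is either empty or consists of a single partition of Type 3, which is exactly the surviving term of the product.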

Next we shall show two applications of the Euler identity. 

3.4 First application: the partition function. The word "partition", which
we have been using before as a common English word, actually has a well
established meaning in combinatorics. From now on, we will use this word according to
the tradition: we call a partition of a number $n$ a sequence of integers $n_1, \dots, n_k$
such that $n = n_1 + \dots + n_k$ and $0 < n_1 \le \dots \le n_k$. We hope that this
terminological shift will not cause any difficulties, but still want to mention that partitions
considered in Section 3.3 are partitions of a special kind: with all parts $n_i$ different.

For a positive integer $n$, denote by $p(n)$ the number of partitions
$n = n_1 + \dots + n_k$, $k > 0$, $0 < n_1 \le \dots \le n_k$. Compute $p(n)$ for small values of $n$:

$p(1) = 1$

$p(2) = 2$ $(2 = 1 + 1)$

$p(3) = 3$ $(3 = 1 + 2 = 1 + 1 + 1)$

$p(4) = 5$ $(4 = 1 + 3 = 2 + 2 = 1 + 1 + 2 = 1 + 1 + 1 + 1)$

Can you find $p(10)$? It is not hard, although you might not be able to get the right
result from the first try. The answer is $p(10) = 42$. And what about $p(20)$? $p(50)$?
$p(100)$? It turns out that we can find these numbers relatively quickly if we use
the Euler identity.

Consider the series

$$p(x) = 1 + x + 2x^2 + 3x^3 + 5x^4 + \dots = 1 + \sum_{r=1}^{\infty} p(r) x^r.$$

Theorem 3.1. $\varphi(x)p(x) = 1$.
Proof.

$$\frac{1}{\varphi(x)} = \prod_{n=1}^{\infty} \frac{1}{1 - x^n} = \prod_{n=1}^{\infty} (1 + x^n + x^{2n} + x^{3n} + \dots)$$

(thus, the series $\frac{1}{\varphi(x)}$ is itself a product of infinitely many series). What is the
coefficient of $x^r$ in the product

$$(1 + x + x^2 + \dots)(1 + x^2 + x^4 + \dots)(1 + x^3 + x^6 + \dots) \dots ?$$

We need to take one summand from each factor (only finitely many of them should
be different from 1) and multiply them up. We get:

$$x^{1 \cdot k_1} \cdot x^{2 k_2} \cdots x^{m k_m} = x^{k_1 + 2k_2 + \dots + m k_m}.$$

We want to count the number of such products with $k_1 + 2k_2 + \dots + m k_m = r$, that
is, the number of presentations

$$r = k_1 + 2k_2 + \dots + m k_m = \underbrace{1 + \dots + 1}_{k_1} + \underbrace{2 + \dots + 2}_{k_2} + \dots + \underbrace{m + \dots + m}_{k_m},$$

that is, the number of partitions of $r$. Thus, the coefficient of $x^r$ in $\frac{1}{\varphi(x)}$ is equal
to $p(r)$. □

Now use the Euler identity:

$$(1 - x - x^2 + x^5 + x^7 - x^{12} - x^{15} + \dots)(1 + p(1)x + p(2)x^2 + p(3)x^3 + \dots) = 1,$$

that is, the coefficient of $x^n$ with any $n > 0$ in this product is equal to 0. We get a
chain of equalities:

$$\begin{aligned}
p(1) - 1 &= 0\\
p(2) - p(1) - 1 &= 0\\
p(3) - p(2) - p(1) &= 0\\
p(4) - p(3) - p(2) &= 0\\
p(5) - p(4) - p(3) + 1 &= 0\\
p(6) - p(5) - p(4) + p(1) &= 0\\
p(7) - p(6) - p(5) + p(2) + 1 &= 0\\
p(8) - p(7) - p(6) + p(3) + p(1) &= 0\\
&\dots
\end{aligned}$$

In general,

$$p(n) = p(n - 1) + p(n - 2) - p(n - 5) - p(n - 7) + p(n - 12) + p(n - 15) - \dots$$

where we count $p(0)$ as 1 and $p(m)$ with $m < 0$ as 0. We can use this as a tool for
an inductive computation of the numbers $p(n)$:



$$\begin{aligned}
p(5) &= p(4) + p(3) - 1 = 5 + 3 - 1 = 7,\\
p(6) &= p(5) + p(4) - p(1) = 7 + 5 - 1 = 11,\\
p(7) &= p(6) + p(5) - p(2) - 1 = 11 + 7 - 2 - 1 = 15,\\
p(8) &= p(7) + p(6) - p(3) - p(1) = 15 + 11 - 3 - 1 = 22,\\
p(9) &= p(8) + p(7) - p(4) - p(2) = 22 + 15 - 5 - 2 = 30,\\
p(10) &= p(9) + p(8) - p(5) - p(3) = 30 + 22 - 7 - 3 = 42,
\end{aligned}$$

and further computations show that $p(20) = 627$, $p(50) = 204{,}226$,
$p(100) = 190{,}569{,}292$.
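
The inductive computation is a few lines of code. A minimal sketch, with a function name of our choosing, that applies the recurrence with the conventions $p(0) = 1$ and $p(m) = 0$ for $m < 0$:

```python
def partition_numbers(limit: int) -> list:
    """p(0), p(1), ..., p(limit) computed from the pentagonal-number recurrence."""
    p = [1] + [0] * limit                       # p(0) = 1
    for n in range(1, limit + 1):
        total, r = 0, 1
        while (3 * r * r - r) // 2 <= n:
            sign = 1 if r % 2 else -1           # signs go + + - - + + ...
            total += sign * p[n - (3 * r * r - r) // 2]
            if (3 * r * r + r) // 2 <= n:
                total += sign * p[n - (3 * r * r + r) // 2]
            r += 1
        p[n] = total
    return p


p = partition_numbers(100)
print(p[1:11])
print(p[20], p[50], p[100])
```

Each value needs only about $\sqrt{n}$ earlier values, since the pentagonal numbers grow quadratically.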
It is worth mentioning that our recursive formula for the function p may be used
for constructing a very simple machine for computing the values of this function. 
This machine is shown in Figure 3.2. Take a sheet of graph paper and cut a 
long strip as shown in the left side of Figure 3.2 (the longer your strip is, the 
more values of the function p you will be able to compute) . In the upper cell of 
the strip draw a (right) arrow. Then write the plus signs in the cells numbered 
1,2,12,15,35,40,... (counting down from the arrow) and the minus sign in the cells 
numbered 5,7,22,26,... Write 1 (it is p(0)) in the lower left corner of the sheet of 
graph paper. Attach the right edge of your strip to the left edge of the sheet in 
such a way that the arrow is against 1. Then move the strip upwards, and every 
time when the arrow is directed into an empty cell (in the left column of the sheet) 
write in this cell the sum of numbers against the pluses minus the sum of numbers 
against the minuses. The numbers written are consecutive values of the function 
p. This procedure is shown in Figure 3.2, up to p(12). 

In conclusion we display an asymptotic formula for $p(n)$ due to Rademacher:

$$p(n) \sim \frac{1}{4n\sqrt{3}}\, e^{\frac{2\pi}{\sqrt{6}}\sqrt{n}}.$$

This $\sim$ means that the ratio of the expression on the right hand side to $p(n)$
approaches 1 when $n$ goes to infinity. Among other things, this formula reveals that
$p(n)$ has a property that is rare for the functions usually occurring in mathematics:
it grows faster than any polynomial but slower than any exponential function $c^n$
with $c > 1$.

3.5 Second application: the sum of divisors. This application gave the
name to Euler's memoir. In this section, we follow Euler's ideas.

For a positive integer $n$, denote by $d(n)$ the sum of divisors of $n$. For example,

$d(4) = 1 + 2 + 4 = 7$,

$d(1000) = 1 + 2 + 4 + 5 + 8 + 10 + 20 + 25 + 40 + 50 + 100 + 125 + 200 + 250 + 500 + 1000 = 2340$,

$d(1001) = 1 + 7 + 11 + 13 + 77 + 91 + 143 + 1001 = 1344$.

Unlike the numbers $p(n)$, the numbers $d(n)$ are easy to compute: there is a
simple explicit formula for them. Namely, if $n = 2^{k_2} 3^{k_3} \dots p^{k_p}$ is a prime factorization
of $n$, then

$$d(n) = (2^{k_2+1} - 1) \cdot \frac{3^{k_3+1} - 1}{2} \cdots \frac{p^{k_p+1} - 1}{p - 1}$$

(see Exercise 3.3). Furthermore, it is interesting that there is a recursive formula for
the numbers $d(n)$, very similar to the formula for $p(n)$ in Section 3.4 and relating
the number $d(n)$ to seemingly unrelated numbers $d(n - 1), d(n - 2), d(n - 5), \dots$.
(For Euler, it was a step towards understanding the nature of the distribution of
prime numbers.)
Let

$$d(x) = \sum_{r=1}^{\infty} d(r) x^r = x + 3x^2 + 4x^3 + 7x^4 + 6x^5 + 12x^6 + \dots$$

Theorem 3.2. $\varphi(x)d(x) + x\varphi'(x) = 0$.

Here $\varphi'(x)$ means the derivative of $\varphi(x)$. Thus,

$$x\varphi'(x) = -x - 2x^2 + 5x^5 + 7x^7 - 12x^{12} - 15x^{15} + \dots$$
Proof of Theorem. Consider the equality

$$\sum_{n=1}^{\infty} \frac{n x^n}{1 - x^n} = \sum_{n=1}^{\infty} n(x^n + x^{2n} + x^{3n} + \dots).$$

If $d_1, d_2, \dots, d_m$ are divisors of $r$ (including 1 and $r$), then $x^r$ appears in the last sum
as $d_i \cdot x^{d_i \cdot \frac{r}{d_i}}$ for every $d_i$, and the total coefficient of $x^r$ will be
$d_1 + d_2 + \dots + d_m = d(r)$. Thus, the sum is $\sum_{r=1}^{\infty} d(r) x^r = d(x)$, that is,

$$d(x) = \sum_{n=1}^{\infty} \frac{n x^n}{1 - x^n}.$$

But

$$\frac{n x^n}{1 - x^n} = -x \cdot [\ln(1 - x^n)]',$$

thus

$$d(x) = -x\left[\sum_{n=1}^{\infty} \ln(1 - x^n)\right]' = -x\left[\ln \prod_{n=1}^{\infty} (1 - x^n)\right]'
= -x \cdot [\ln \varphi(x)]' = -\frac{x\varphi'(x)}{\varphi(x)},$$

which shows that $d(x)\varphi(x) + x\varphi'(x) = 0$. □

Equating to 0 the coefficient of $x^n$, $n > 0$, on the left hand side of the last
equality, we find that

$$d(n) - d(n - 1) - d(n - 2) + d(n - 5) + d(n - 7) - \dots =
\begin{cases} -(-1)^m\, \dfrac{3m^2 \pm m}{2}, & \text{if } n = \dfrac{3m^2 \pm m}{2},\\[6pt]
0, & \text{if } n \text{ is not a pentagonal number.}\end{cases}$$

It is better to formulate this in the following form:

$$d(n) = d(n - 1) + d(n - 2) - d(n - 5) - d(n - 7) + d(n - 12) + d(n - 15) - \dots$$

where $d(k)$ with $k < 0$ is counted as 0, and $d(0)$ (if it appears in this formula) is
counted as $n$.
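
The recurrence, including the unusual convention for $d(0)$, can be compared with a direct computation of the sum of divisors. A sketch with our own function names; `sigma` uses plain trial division and is meant only as an independent check.

```python
def sigma(n: int) -> int:
    """Sum of all divisors of n (the function d(n) of the text), by trial division."""
    return sum(k for k in range(1, n + 1) if n % k == 0)


def sigma_by_recurrence(limit: int) -> list:
    """d(1), ..., d(limit) from Euler's recurrence; index 0 of the result is unused."""
    d = [0] * (limit + 1)
    for n in range(1, limit + 1):
        total, r = 0, 1
        while (3 * r * r - r) // 2 <= n:
            sign = 1 if r % 2 else -1           # signs go + + - - + + ...
            for g in ((3 * r * r - r) // 2, (3 * r * r + r) // 2):
                if g < n:
                    total += sign * d[n - g]
                elif g == n:
                    total += sign * n           # d(0) is counted as n
            r += 1
        d[n] = total
    return d


d = sigma_by_recurrence(200)
assert all(d[n] == sigma(n) for n in range(1, 201))
print(d[1:13], sigma(1000), sigma(1001))
```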

3.6 The identities of Gauss and Jacobi. About 70 years after Euler's
discovery, another great mathematician, Carl-Friedrich Gauss, proved that the cube
of the Euler function provides a series even more remarkable than the Euler series:

$$\varphi(x)^3 = (1 - x)^3 (1 - x^2)^3 (1 - x^3)^3 \dots = 1 - 3x + 5x^3 - 7x^6 + 9x^{10} - 11x^{15} + \dots$$

or

$$\prod_{n=1}^{\infty} (1 - x^n)^3 = \sum_{r=0}^{\infty} (-1)^r (2r + 1) x^{\frac{r(r+1)}{2}}.$$

The Gauss identity appears even more remarkable if we notice that the square
of the Euler function does not reveal, at least at first glance, any interesting
properties:

$$\varphi(x)^2 = 1 - 2x - x^2 + 2x^3 + x^4 + 2x^5 - 2x^6 - 2x^8 - 2x^9 + x^{10} + \dots$$
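
Both expansions can be reproduced by repeated polynomial multiplication. A sketch (the function name `euler_power` is ours) that truncates every product at a chosen degree:

```python
def euler_power(exponent: int, max_degree: int) -> list:
    """Coefficients of ((1 - x)(1 - x^2)(1 - x^3)...)**exponent up to x^max_degree."""
    coeffs = [1] + [0] * max_degree
    for n in range(1, max_degree + 1):
        for _ in range(exponent):
            # Multiply by (1 - x^n) in place, from high degree to low.
            for d in range(max_degree, n - 1, -1):
                coeffs[d] -= coeffs[d - n]
    return coeffs


cube = euler_power(3, 60)
print({d: c for d, c in enumerate(cube) if c != 0})   # non-zero terms of the cube
print(euler_power(2, 10))                             # the square, for comparison
```

In the cube the non-zero coefficients sit exactly at the triangular numbers $\frac{r(r+1)}{2}$ and equal $(-1)^r(2r+1)$, while the square shows no comparable pattern.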
Several proofs are known for the Gauss identity, and they belong to very
different parts of mathematics, such as homological algebra, complex analysis, and
hyperbolic geometry (this fact by itself may be regarded as an indication that the
result is very deep). There exists also an elementary combinatorial proof (which we
shall discuss in Section 3.7). Most of these proofs (including the proof in Section
3.7) yield, actually, a stronger result: the two-variable Jacobi identity:

$$\prod_{n=1}^{\infty} (1 + y^{-1} z^{2n-1})(1 + y z^{2n-1})(1 - z^{2n}) = \sum_{r=-\infty}^{\infty} y^r z^{r^2}. \tag{3.1}$$

Before proving it, we shall show that it implies the Gauss identity.

Deducing the Gauss identity from the Jacobi identity. Differentiate the two
sides of the Jacobi identity (3.1) with respect to $z$, then put $y = -z$, and then put
$z^2 = x$.

To differentiate a product (even infinite) we need to take the derivative of one
factor, leaving all the rest unchanged, and then add up all the resulting products:

$$(f_1 f_2 f_3 \cdots)' = f_1' f_2 f_3 \cdots + f_1 f_2' f_3 \cdots + f_1 f_2 f_3' \cdots + \dots$$

But the very first factor in the left hand side of the Jacobi identity, $(1 + y^{-1} z)$, is
annihilated by the substitution $y = -z$. Hence of all the summands in the derivative
of the product, only one survives this substitution, and this is

$$(1 + y^{-1} z)'_z\, (1 + yz)(1 - z^2) \prod_{n=2}^{\infty} (1 + y^{-1} z^{2n-1})(1 + y z^{2n-1})(1 - z^{2n}).$$

After the substitution $y = -z$ we get (since $(1 + y^{-1} z)'_z = y^{-1}$):

$$-z^{-1}(1 - z^2)^2 \prod_{n=2}^{\infty} (1 - z^{2n-2})(1 - z^{2n})^2 = -z^{-1} \prod_{n=1}^{\infty} (1 - z^{2n})^3.$$

The whole identity (3.1) becomes (since $(y^r z^{r^2})'_z = r^2 y^r z^{r^2 - 1}$)

$$\prod_{n=1}^{\infty} (1 - z^{2n})^3 = -z \sum_{r=-\infty}^{\infty} r^2 (-1)^r z^r z^{r^2 - 1} = \sum_{r=-\infty}^{\infty} (-1)^{r+1} r^2 z^{r^2 + r},$$

which becomes, after the substitution $z^2 = x$,

$$\varphi(x)^3 = \sum_{r=-\infty}^{\infty} (-1)^{r+1} r^2 x^{\frac{r^2 + r}{2}}. \tag{3.2}$$

It remains to notice that the $r$-th and the $(-r - 1)$-th terms on the right hand side
of (3.2) are like terms: $\frac{(-r-1)^2 + (-r-1)}{2} = \frac{r^2 + r}{2}$. Hence,

$$\sum_{r=-\infty}^{\infty} (-1)^{r+1} r^2 x^{\frac{r^2+r}{2}}
= \sum_{r=0}^{\infty} \left[(-1)^{r+1} r^2 + (-1)^{-r} (-r - 1)^2\right] x^{\frac{r^2+r}{2}}
= \sum_{r=0}^{\infty} (-1)^r (2r + 1) x^{\frac{r(r+1)}{2}},$$

as required. □



We remark, in conclusion, that the Jacobi identity may be used to prove other
one-variable identities. For example, if we simply plug $y = -1$ in the Jacobi identity
(3.1) (and then replace $z$ by $x$), we get a remarkable identity

$$(1 - x)^2 (1 - x^2)(1 - x^3)^2 (1 - x^4) \cdots = \sum_{r=-\infty}^{\infty} (-1)^r x^{r^2} = 1 + 2\sum_{r=1}^{\infty} (-1)^r x^{r^2},$$

also known to Gauss. By the way, the left hand side of this identity is $\frac{\varphi(x)^2}{\varphi(x^2)}$; hence,
we can get from it also a formula for $\varphi(x)^2$:

$$\varphi(x)^2 = \varphi(x^2) \sum_{s=-\infty}^{\infty} (-1)^s x^{s^2} = \sum_{r=-\infty}^{\infty} \sum_{s=-\infty}^{\infty} (-1)^{r+s} x^{3r^2 + r + s^2},$$

not as remarkable, however, as the formulas for $\varphi(x)$ and $\varphi(x)^3$.

For another identity involving $\varphi(x)$ and following from the Jacobi identity, see
Exercise 3.4.

3.7 Proof of the Jacobi identity. This proof is due to Zinovy Leibenzon;
we follow his article [50] and use his terminology.

Rewrite the Jacobi identity as

$$\prod_{n=1}^{\infty} (1 + y z^{2n-1})(1 + y^{-1} z^{2n-1}) = \prod_{n=1}^{\infty} (1 - z^{2n})^{-1} \sum_{r=-\infty}^{\infty} y^r z^{r^2}
= p(z^2) \sum_{r=-\infty}^{\infty} y^r z^{r^2} = \sum_{n=0}^{\infty} p(n) z^{2n} \sum_{r=-\infty}^{\infty} y^r z^{r^2}$$

and compare the coefficients of $y^r z^{2n + r^2}$. On the right hand side, the coefficient is,
obviously, $p(n)$. On the left hand side, $y^r z^{2n + r^2}$ may appear as a product

$$y z^{2\alpha_1 - 1} \cdots y z^{2\alpha_s - 1} \cdot y^{-1} z^{2\beta_1 - 1} \cdots y^{-1} z^{2\beta_t - 1}$$

where $0 < \alpha_1 < \dots < \alpha_s$, $0 < \beta_1 < \dots < \beta_t$, $s - t = r$, and

$$\sum_{i=1}^{s} (2\alpha_i - 1) + \sum_{j=1}^{t} (2\beta_j - 1) = 2n + r^2.$$

Thus, the coefficient of $y^r z^{2n + r^2}$ is equal to the number of sets $((\alpha_1, \dots, \alpha_s),
(\beta_1, \dots, \beta_t))$ with the properties indicated. We denote this number by $q(n, r)$.
To prove the Jacobi identity, we need to prove the following.

Proposition 3.1. $q(n, r) = p(n)$ (in particular, $q(n, r)$ does not depend on $r$).

To prove the proposition, we need the following construction. 

By a chain we mean an infinite, in both directions, sequence of symbols of two
types: O (circles) and | (sticks), such that to the left of some place only circles
occur, and to the right of some place only sticks occur. Examples:

... O O O | | O | O | O O | | | | ...

... O O | | | | O O O | O | | | | ...

We do not distinguish between chains obtained from each other by translations to
the left or to the right.

The height of a chain $A$, $h(A)$, is defined as the number of inversions, that is,
pairs of symbols (not necessarily consecutive), of which the left one is a stick and
the right one is a circle. For the two examples above, the heights are 13 and 17.
We shall assume that the distance between any two neighboring symbols in a
chain is 2, and that between them, at distance 1 from each, there is a lacuna. The
lacunas of a given chain can be naturally enumerated: we say that a lacuna $T$ has
index $r$, if the number of sticks to the left of $T$ minus the number of circles to the
right of $T$ is equal to $r$. It is clear that when we move from left to right the index
of the lacuna increases by 1. Example (the index of each lacuna is written in parentheses):

... O (-6) O (-5) O (-4) | (-3) | (-2) O (-1) | (0) O (1) | (2) O (3) O (4) | (5) | (6) | ...
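The definitions of height and lacuna index can be tried out on the two example chains. This is a minimal sketch under one assumption that is not in the text: a chain is passed as a finite string of the characters `O` and `|`, with only circles implied to its left and only sticks implied to its right. The names `height` and `lacuna_indices` are illustrative.

```python
def height(chain):
    """Number of inversions: pairs (stick on the left, circle on the right)."""
    sticks_seen = 0
    inversions = 0
    for symbol in chain:
        if symbol == "|":
            sticks_seen += 1
        elif symbol == "O":
            # every stick already seen forms an inversion with this circle
            inversions += sticks_seen
    return inversions


def lacuna_indices(chain):
    """Indices of the lacunas between consecutive symbols of the finite window."""
    total_circles = chain.count("O")
    sticks_left = 0
    circles_left = 0
    indices = []
    for symbol in chain[:-1]:
        if symbol == "|":
            sticks_left += 1
        else:
            circles_left += 1
        # sticks to the left minus circles to the right of this lacuna
        indices.append(sticks_left - (total_circles - circles_left))
    return indices


if __name__ == "__main__":
    print(height("OOO||O|O|OO||||"), height("OO||||OOO|O||||"))
    print(lacuna_indices("OOO||O|O|OO||||"))
```

Each step to the right either adds a stick on the left or removes a circle on the right, so the index grows by exactly 1, as the text states.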



Proof of Proposition. We shall compute, in two ways, the number of chains of
height $n$.

First way. For a chain $A$ of height $n$, denote by $n_i$ the number of circles to the
right of the $i$-th stick from the left. Obviously, $n_1 \ge n_2 \ge \cdots$; $n_i = 0$ for $i$ large
enough; and $n_1 + n_2 + \cdots = n$. These numbers $n_1, n_2, \dots$ determine the chain and
may take arbitrary values (if they satisfy the conditions above). Thus, the number
of chains of height $n$ is $p(n)$.

Second way. Fix an integer $r$, and consider the lacuna $T$ number $r$. Let there be $s$
sticks to the left of $T$ and $t$ circles to the right of $T$; thus, $s - t = r$. Let the distances
of the sticks to the left of $T$ to $T$ be (in the ascending order) $2\alpha_1 - 1, \dots, 2\alpha_s - 1$
and the distances of the circles to the right of $T$ to $T$, in the ascending order, be
$2\beta_1 - 1, \dots, 2\beta_t - 1$. Example (the numbers in parentheses are the values $\alpha_i$ for
the sticks to the left of $T$ and $\beta_j$ for the circles to the right of $T$):

... O O O |(8) O O O |(4) |(3) O |(1) T O(1) | | O(4) | | ...

The numbers $s, t, \alpha_1, \dots, \alpha_s, \beta_1, \dots, \beta_t$ determine the chain. Let us prove that

$$2n + r^2 = \sum_{i=1}^{s}(2\alpha_i - 1) + \sum_{j=1}^{t}(2\beta_j - 1).$$

There are 3 kinds of inversions in the chain $A$: (1) both the circle and the
stick are to the left of $T$; (2) both the circle and the stick are to the right of $T$,
and (3) the circle is to the right of $T$ and the stick is to the left of $T$. Between a
stick at distance $2\alpha_i - 1$ to the left of $T$ and $T$ (including this stick), there are $\alpha_i$
symbols, of which $i$ are sticks and $\alpha_i - i$ are circles; thus this stick participates in
$\alpha_i - i$ inversions of the first kind, and the total number of inversions of the 1-st
kind is $\sum_{i=1}^{s}(\alpha_i - i)$. Similarly, there are $\sum_{j=1}^{t}(\beta_j - j)$ inversions of the 2-nd kind,
and, obviously, the number of inversions of the 3-rd kind is $st$. Thus,

$$\begin{aligned}
n &= \sum_{i=1}^{s}(\alpha_i - i) + \sum_{j=1}^{t}(\beta_j - j) + st \\
&= \sum_{i=1}^{s}\alpha_i + \sum_{j=1}^{t}\beta_j - \frac{s(s+1)}{2} - \frac{t(t+1)}{2} + st \\
&= \sum_{i=1}^{s}\alpha_i + \sum_{j=1}^{t}\beta_j - \frac{s^2 + s - 2st + t^2 + t}{2} \\
&= \sum_{i=1}^{s}\alpha_i + \sum_{j=1}^{t}\beta_j - \frac{r^2 + s + t}{2},
\end{aligned}$$

$$2n + r^2 = 2\sum_{i=1}^{s}\alpha_i + 2\sum_{j=1}^{t}\beta_j - s - t = \sum_{i=1}^{s}(2\alpha_i - 1) + \sum_{j=1}^{t}(2\beta_j - 1).$$

We see that the number of chains of height $n$ is $q(n, r)$.

Thus, $p(n) = q(n, r)$ which proves the Proposition and the Jacobi identity. □
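Proposition 3.1 can also be checked by brute force for small $n$ and $r$: expand the truncated product $\prod(1 + yz^{2k-1})(1 + y^{-1}z^{2k-1})$ and read off the coefficient of $y^r z^{2n+r^2}$. The sketch below does this with a dictionary keyed by exponent pairs; the helper names are illustrative, and the check is only a finite sanity test, not a replacement for the proof.

```python
from collections import defaultdict


def partition_numbers(N):
    """Return [p(0), ..., p(N)], counting partitions part by part."""
    p = [1] + [0] * N
    for part in range(1, N + 1):
        for total in range(part, N + 1):
            p[total] += p[total - part]
    return p


def jacobi_left_side(max_deg):
    """Coefficients of prod_k (1 + y z^(2k-1))(1 + y^-1 z^(2k-1)), truncated in z."""
    poly = {(0, 0): 1}  # (exponent of y, exponent of z) -> coefficient
    for odd in range(1, max_deg + 1, 2):
        for dy in (1, -1):  # the factor with y, then the factor with y^-1
            new = defaultdict(int)
            for (ey, ez), coeff in poly.items():
                new[(ey, ez)] += coeff
                if ez + odd <= max_deg:
                    new[(ey + dy, ez + odd)] += coeff
            poly = dict(new)
    return poly


def q(n, r, poly):
    """Coefficient of y^r z^(2n + r^2) in the truncated product."""
    return poly.get((r, 2 * n + r * r), 0)


if __name__ == "__main__":
    poly = jacobi_left_side(40)
    print([q(n, 2, poly) for n in range(10)])
    print(partition_numbers(9))
```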

3.8 Powers of the Euler function. Thus far, we know what the series for
$\varphi(x)$ and $\varphi(x)^3$ look like, but we have nothing equally good for $\varphi(x)^2$. And what
about the series $\varphi(x)^4$, $\varphi(x)^5$, etc.? In other words, for which $n$ is there a formula for
the coefficients of the series $\varphi(x)^n$? To answer this informal (that is, not rigorously
formulated) question, we shall use the following semi-formal criterion. If, for some
$n$, there are many zeroes among the coefficients of the series $\varphi(x)^n$, this might
mean that there is a formula for $\varphi(x)^n$ resembling the formulas of Euler and Gauss.
(However, if there are only few zeroes, or no zeroes at all, this cannot be considered
as a clear indication that a formula does not exist.) It is a matter of a simple
computer program to find the number of zeroes among, say, the first 500 coefficients
of $\varphi(x)^n$. We denote this number by $c(n)$, and here are the values of $c(n)$ for $n \le 35$:



n       1     2     3     4     5     6     7     8
c(n)    464   243   469   158   0     212   0     250

n       9     10    11-13   14    15    16-25   26    27-35
c(n)    0     151   0       172   2     0       80    0






We can make the following observation. For $n = 1, 3$, there are very many
zeroes (we already know this); for $n = 2, 4, 6, 8, 10, 14, 26$, the number of zeroes
is substantial; for $n = 15$, there are 2 zeroes (which cannot be considered as
serious evidence of anything²); for $n = 5, 7, 9, 11\text{-}13, 16\text{-}25, 27\text{-}35$, there are
no zeroes at all. We should not be surprised by a substantial amount of zeroes
for $n = 2, 4, 6$: the series for $\varphi(x)$ and $\varphi(x)^3$ are so sparse that their products
$\varphi(x)^2 = \varphi(x)\cdot\varphi(x)$, $\varphi(x)^4 = \varphi(x)\cdot\varphi(x)^3$, $\varphi(x)^6 = \varphi(x)^3\cdot\varphi(x)^3$ may lack some
powers of $x$ even before collecting like terms. For example, the numbers 11, 18, 21
(and many others) cannot be presented as sums of pairs of numbers of the form
$\dfrac{3n^2 \pm n}{2}$, and for this reason there are no terms $x^{11}, x^{18}, x^{21}$ in the series for $\varphi(x)^2$.
For a similar reason, there are no terms $x^9, x^{14}, x^{19}$ in the series for $\varphi(x)^4$, and
no terms $x^5, x^8, x^{14}$ in the series for $\varphi(x)^6$. But why are there many zeroes in the
series for $\varphi(x)^8$, $\varphi(x)^{10}$, $\varphi(x)^{14}$, and $\varphi(x)^{26}$?

² Although a formula for $\varphi(x)^{15}$ exists, see below.
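The "simple computer program" mentioned above might look like the following sketch. It assumes that "the first 500 coefficients" means the coefficients of $x^1, \dots, x^{500}$; this reading is an assumption, chosen because it reproduces the tabulated values 464 and 469 for $n = 1$ and $n = 3$. The names `euler_power` and `c` are illustrative.

```python
def euler_power(n, N):
    """Coefficients of phi(x)^n = prod_{m>=1} (1 - x^m)^n up to degree N."""
    coeffs = [1] + [0] * N
    for m in range(1, N + 1):
        for _ in range(n):  # multiply by (1 - x^m), n times
            for d in range(N, m - 1, -1):
                coeffs[d] -= coeffs[d - m]
    return coeffs


def c(n, count=500):
    """Number of zero coefficients among x^1, ..., x^count of phi(x)^n."""
    coeffs = euler_power(n, count)
    return sum(1 for a in coeffs[1:] if a == 0)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 8):
        print(n, c(n))
```

Exact integer arithmetic is used throughout, so a coefficient is reported as zero only if it really vanishes.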

It turns out that there are formulas for these powers of the Euler function,
not so simple as the formulas of Euler and Gauss, but also deep and beautiful.
(There are also formulas for some other powers of the Euler function, but this is
not reflected in our table.) As an illustration, let us show a formula for $\varphi(x)^8$ due
to Felix Klein:



where the summation on the right hand side is taken over all triples $(k, l, m)$ of
integers such that $k + l + m = 1$. One can see from the formula that if a number
$r$ cannot be presented as $-(kl + km + lm)$ with $k + l + m = 1$, then the series for
$\varphi(x)^8$ does not contain $x^r$. For example, it does not contain $x^r$ if $r = 4s + 3$ (with
$s$ integral) or if $r = 13, 18, 28, 29$ (see Exercise 3.5).

We see from all this that there exist some "privileged exponents" $n$ for which a
comprehensible formula for $\varphi(x)^n$ exists. The mystery of privileged exponents was
resolved in 1972 by Ian MacDonald (see Section 3.9 for a partial statement of his
results) . An account of this discovery is contained in an emotionally written article 
of F. Dyson [26]. A couple of words should be said about Dyson and his article. 
Freeman Dyson is one of the most prominent physicists of our time. He started his 
career as a mathematician and has some well known works in classical combinatorics 
and number theory. The goal of his article was to show how lack of communication 
between physicists and mathematicians resulted in a catastrophic delay of some 
major discoveries in both disciplines. Below is an excerpt from Dyson's article 
related to our subject. 

3.9 Dyson's story. "I begin with a trivial episode from my own experience, 
which illustrates vividly how the habit of specialization can cause us to miss op- 
portunities. This episode is related to some recent and beautiful work by Ian 
MacDonald on the properties of affine root systems of the classical Lie algebras. 

I started life as a number theorist and during my undergraduate days at Cam- 
bridge I sat at the feet of the already legendary figure G. H. Hardy. It was clear 
even to an undergraduate in those days that number theory in the style of Hardy 
and Ramanujan was old-fashioned and did not have a great and glorious future 
ahead of it. Indeed, Hardy in a published lecture on the $\tau$-function of Ramanujan
had himself described this subject as "one of the backwaters of mathematics". The
$\tau$-function is defined as the coefficient in the modular form

$$\sum_{n=1}^{\infty}\tau(n)x^n = x\prod_{m=1}^{\infty}(1 - x^m)^{24}. \tag{3.3}$$

Ramanujan discovered a number of remarkable arithmetical properties of $\tau(n)$. The
proof and generalization of these properties by Mordell, Hecke, and others played
a significant part in the development of the theory of modular forms. But the
$\tau$-function itself has remained a backwater, far from the mainstream of mathematics,
where amateurs can dabble to their hearts' content undisturbed by competition
from professionals.³ Long after I became a physicist, I retained a sentimental
attachment to the $\tau$-function, and as a relief from the serious business of physics
I would from time to time go back to Ramanujan's papers and meditate on the
many intriguing problems that he left unsolved. Four years ago, during one of
these holidays from physics, I found a new formula for the $\tau$-function, so elegant
that it is rather surprising that Ramanujan did not think of it himself. The formula
is

$$\tau(n) = \sum\frac{\prod_{1\le i<j\le 5}(a_i - a_j)}{1!\,2!\,3!\,4!}, \tag{3.4}$$

summed over all sets of integers $a_1, \dots, a_5$ with $a_i \equiv i \bmod 5$, $a_1 + \dots + a_5 = 0$,
$a_1^2 + \dots + a_5^2 = 10n$. This can also be written as a formula for the 24-th power
of the Euler function $\varphi$ according to (3.3). I was led to it by a letter from Winquist
who discovered a similar formula for the 10-th power of $\varphi$. Winquist also happens
to be a physicist who dabbles in old-fashioned number theory in his spare time.

Pursuing these identities further by my pedestrian methods, I found that there
exists a formula of the same degree of elegance as (3.4) for all $d$-th powers of $\varphi$
whenever $d$ belongs to the following sequence of integers:

$$d = 3, 8, 10, 14, 15, 21, 24, 26, 28, 35, 36, \dots \tag{3.5}$$

In fact, the case $d = 3$ was discovered by Jacobi, the case $d = 8$ by Klein and
Fricke, and the cases $d = 14, 26$ by Atkin. There I stopped. I stared for a little
while at this queer list of numbers (3.5). As I was, for the time being, a number
theorist, they made no sense to me. My mind was so well compartmentalized that
I did not remember that I had met these same numbers many times in my life as
a physicist. If the numbers had appeared in the context of a problem in physics, I
would certainly have recognized them as the dimensions of finite-dimensional simple
Lie algebras. Except for 26. Why 26 is there I still do not know.⁴ So I missed the
opportunity of discovering a deeper connection between modular forms and Lie
algebras, just because the number theorist Dyson and the physicist Dyson were not
speaking to each other.

³ In a footnote to a Russian translation of Dyson's article (published in 1980), the translator
noticed that it was difficult for him even to imagine that it could ever be so.

⁴ Let us provide a short explanation. Rotations of the plane around a point depend on 1
parameter: the angle of rotation. Rotations of three-dimensional space depend on 3 parameters:
the latitude and longitude of the axis of rotation and the angle of rotation. In general,
rotations of an $n$-dimensional space depend on $\frac{n(n-1)}{2}$ parameters, and rotations of a complex
$n$-dimensional space depend on $n^2 - 1$ parameters. To the numbers $\frac{n(n-1)}{2}$ and $n^2 - 1$, that
is, 1, 3, 6, 10, 15, 21, 28, 36, ... and 3, 8, 15, 24, 35, ... one should add five "exceptional dimensions"
14, 52, 78, 133, 248. If one also removes, as Dyson does, the numbers 1 and 6, and adds 26 (which
appears here, according to a more modern explanation, as 52 ÷ 2), then the sequence (3.5) arises;
certainly, any theoretical physicist remembers this sequence very firmly.

This story has a happy ending. Unknown to me the English geometer, Ian MacDonald,
had discovered the same formulas as a special case of a much more general
theory. In his theory, the Lie algebras were incorporated from the beginning, and it 
was the connection with modular forms which came as a surprise. Anyhow, Mac- 
Donald established the connection and so picked the opportunity which I missed. It 
happened also that MacDonald was at the Institute for Advanced Study in Prince- 
ton while we were both working on the problem. Since we had daughters in the 
same class at school, we saw each other from time to time during his year in Prince- 
ton. But since he was a mathematician and I was a physicist, we did not discuss 
our work. The fact that we were thinking about the same problem while sitting 
so close to one another only emerged after he had gone back to Oxford. This was 
another missed opportunity, but not a tragic one, since MacDonald cleaned up the 
whole subject without any help from me." 

3.10 MacDonald's identities. We finish this lecture with an infinite collec- 
tion of identities which comprise a substantial part of MacDonald's work mentioned 
by Dyson. The first formula generalizes the Jacobi identity (which corresponds to 
the case n = 2): 



n 



fe=i 



• \ / - • • • ^ n 



-1 / \ X\ . . . X<i—\Xj . . . X n ^ 

^ • • • , A: n )x 1 1 . . . x^ n 

where the summation on the right hand side is taken over all $n$-tuples of
non-negative integers $(k_1, \dots, k_n)$ satisfying the equation

$$k_1^2 + \dots + k_n^2 = k_1 + \dots + k_n + k_1k_2 + \dots + k_{n-1}k_n + k_nk_1 \tag{3.6}$$

and $\varepsilon(k_1, \dots, k_n) = \pm 1$ is defined in the following way. If the numbers $k_1, \dots, k_n$
satisfy equation (3.6), then so do the numbers $k_1, \dots, k_{i-1}, k_i', k_{i+1}, \dots, k_n$ where
$k_i' = -k_i + k_{i-1} + k_{i+1} + 1$ (here $1 \le i \le n$; if $i = n$, we should take $k_1$ for $k_{i+1}$,
and if $i = 1$ then we should take $k_n$ for $k_{i-1}$). Moreover, any $n$-tuple $k_1, \dots, k_n$ of
non-negative integers satisfying equation (3.6) can be obtained from $(0, \dots, 0)$ by a
finite sequence of such transformations. This may be done in many different ways;
but the parity of the number of such transformations depends only on $k_1, \dots, k_n$. If
this number is even, then $\varepsilon(k_1, \dots, k_n) = 1$; otherwise, $\varepsilon(k_1, \dots, k_n) = -1$. There
are some more explicit formulas for $\varepsilon(k_1, \dots, k_n)$. For example, if $n = 2$, then
equation (3.6) becomes $(k_1 - k_2)^2 = k_1 + k_2$ and all integral solutions are

$$\left(\frac{n(n-1)}{2}, \frac{n(n+1)}{2}\right), \quad -\infty < n < \infty;$$

the corresponding $\varepsilon$ is $(-1)^n$. If $n = 3$, then

$$\varepsilon(k_1, k_2, k_3) = \begin{cases} 1, & \text{if } k_1 + k_2 + k_3 \equiv 0 \bmod 3, \\ -1, & \text{if } k_1 + k_2 + k_3 \equiv 1 \bmod 3 \end{cases}$$

(the case $k_1 + k_2 + k_3 \equiv 2 \bmod 3$ is not possible). If $n = 4$, then

$$\varepsilon(k_1, k_2, k_3, k_4) = \begin{cases} 1, & \text{if } k_1 + k_2 + k_3 + k_4 \equiv 0, 2, 3, 7 \bmod 8, \\ -1, & \text{if } k_1 + k_2 + k_3 + k_4 \equiv 1, 4, 5, 6 \bmod 8. \end{cases}$$
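The definition of $\varepsilon$ through transformations lends itself to a small experiment: start from $(0, \dots, 0)$, apply the transformations $k_i \mapsto -k_i + k_{i-1} + k_{i+1} + 1$ breadth-first, and record the parity of the number of steps. The sketch below is only an illustration; the cut-off `bound` on the size of the entries and the function names are assumptions made here, not part of the text.

```python
from collections import deque


def satisfies_36(k):
    """Check equation (3.6) for the tuple k (indices are taken cyclically)."""
    n = len(k)
    lhs = sum(v * v for v in k)
    rhs = sum(k) + sum(k[i] * k[(i + 1) % n] for i in range(n))
    return lhs == rhs


def signs(n, bound):
    """Map each reachable n-tuple with entries in [0, bound] to its epsilon."""
    start = (0,) * n
    eps = {start: 1}
    queue = deque([start])
    while queue:
        k = queue.popleft()
        for i in range(n):
            # k[i - 1] wraps around to the last entry when i == 0
            new_ki = -k[i] + k[i - 1] + k[(i + 1) % n] + 1
            if not 0 <= new_ki <= bound:
                continue
            nxt = k[:i] + (new_ki,) + k[i + 1:]
            if nxt not in eps:
                eps[nxt] = -eps[k]  # one more transformation flips the sign
                queue.append(nxt)
    return eps


if __name__ == "__main__":
    for k, e in sorted(signs(3, 4).items()):
        print(k, sum(k) % 3, e)
```

For $n = 3$ the printed signs can be compared with the residue of $k_1 + k_2 + k_3$ modulo 3, and for $n = 2$ with $(-1)^n$ on the pairs listed above.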

The second formula generalizes the Gauss identity (and also Klein's identity
and Dyson's identity) to a formula for $\varphi(x)^{n^2-1}$:



where the summation is taken over the same $n$-tuples $(k_1, \dots, k_n)$ as in the previous
identity and $\varepsilon(k_1, \dots, k_n)$ has the same meaning as before.




3.11 Exercises. 

3.1. Prove that

[ The number of partitions $n = n_1 + \dots + n_k$ ($k > 0$) with $0 < n_1 < \dots < n_k$ ]

=

[ The number of partitions $n = n_1 + \dots + n_k$ ($k > 0$) with $0 < n_1 \le \dots \le n_k$ and all $n_i$ odd ]

Hint. There is a natural 1-1 correspondence between the partitions in the left
box and the partitions in the right box. The reader is encouraged to guess how it
works from the following examples:

1 + 3 + 6 + 10 <-> 1 + 3 + 3 + 3 + 5 + 5
1 + 4 + 7 + 11 <-> 1 + 1 + 1 + 1 + 1 + 7 + 11
2 + 4 + 6 <-> 1 + 1 + 1 + 1 + 1 + 1 + 3 + 3
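Before looking for the correspondence, the equality of the two counts can be tested for small $n$. The sketch below only counts the partitions on each side (it deliberately does not construct the bijection, which is the point of the exercise); the function names are illustrative.

```python
def count_distinct(n):
    """Number of partitions of n into distinct positive parts."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        # descending order: each part is used at most once
        for total in range(n, part - 1, -1):
            ways[total] += ways[total - part]
    return ways[n]


def count_odd(n):
    """Number of partitions of n into odd parts, repetitions allowed."""
    ways = [1] + [0] * n
    for part in range(1, n + 1, 2):
        # ascending order: a part may be reused any number of times
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, count_distinct(n), count_odd(n))
```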

3.2. Prove that for any real $s > 1$ (or a complex $s$ with the real part $\operatorname{Re} s > 1$),

$$1 + \frac{1}{2^s} + \frac{1}{3^s} + \frac{1}{4^s} + \frac{1}{5^s} + \dots = \left(\frac{2^s}{2^s - 1}\right)\left(\frac{3^s}{3^s - 1}\right)\left(\frac{5^s}{5^s - 1}\right)\left(\frac{7^s}{7^s - 1}\right)\cdots$$

or, shorter,

$$\sum_{n=1}^{\infty}\frac{1}{n^s} = \prod_{p \in \{\text{primes}\}}\frac{p^s}{p^s - 1}.$$

Remarks. 1. This formula, also due to Euler, is not related directly to the
subject of this lecture, but its proof strongly resembles the proof of Theorem 3.1,
and we hope that the reader will appreciate it.

2. The expression on the left hand side (and hence the right hand side) of
the last formula is denoted by $\zeta(s)$. This is the celebrated Riemann $\zeta$-function.
A simple trick provides an extension of this function to all complex values of the
argument (besides $s = 1$). It is well known that $\zeta(-2n) = 0$ for any positive
integer $n$. The Riemann Hypothesis (which is, probably, currently the most famous
unsolved problem in mathematics) states that if $\zeta(s) = 0$ and $s \ne -2n$ for any
positive integer $n$, then $\operatorname{Re} s = \frac{1}{2}$.

3.3. Prove the formula from Section 3.5: if $n = 2^{k_2}3^{k_3}5^{k_5}\cdots$ is a prime
factorization of $n$, then

$$d(n) = \prod_{p \in \{\text{primes}\}}\frac{p^{k_p+1} - 1}{p - 1}.$$

3.4. Deduce from the Jacobi identity the following identity involving the Euler
function $\varphi$:



\ A / i\n 2n 2 +n 



(-!)"» 



Hint. Try $z = -y^2$.



3.5. Prove that if $k, l, m$ are integers and $k + l + m = 1$, then $-(kl + km + lm)$
is a non-negative integer not congruent to 3 modulo 4.

Remarks. 1. This is related to the Klein identity for $\varphi(x)^8$.

2. According to the table in Section 3.8, 250 of the first 500 coefficients of the
series for $\varphi(x)^8$ are zeroes. This exercise specifies 125 of them. The numbers which
constitute the remaining 125 ones look chaotic. The reader may try to find some
order in this chaos.

3.6. (a) Let $q(n)$ be the number of partitions $n = n_1 + \dots + n_k$ with
$0 < n_1 \le n_2 \le \dots \le n_{k-1} < n_k$ (if $k = 1$, this means only that $0 < n$). Prove that
$q(n) = p(n - 1)$ for $n \ge 1$.

(b) Deduce from (a) that $p(n) > p(n - 1)$ for $n \ge 2$.

3.7. Prove that $p(n) \le F_n$ where $F_n$ is the $n$-th Fibonacci number ($F_0 = F_1 = 1$,
$F_n = F_{n-1} + F_{n-2}$ for $n \ge 2$).

Hint. Use the Euler identity and Exercise 3.6 (b).

3.8.* Let $F_n$ ($n = 1, 2, \dots$) be the Fibonacci numbers ($F_1 = 1$, $F_2 = 2$,
$F_n = F_{n-1} + F_{n-2}$ for $n \ge 3$; in contrast to Exercise 3.7, we do not consider $F_0$).

(a) Prove that every integer $n \ge 1$ can be represented as the sum of distinct
Fibonacci numbers, $n = F_{k_1} + \dots + F_{k_s}$, $1 \le k_1 < \dots < k_s$.

(b) Prove that a partition of $n$ as in Part (a) exists and is unique, if we impose
the additional condition: $k_i - k_{i-1} \ge 2$ for $1 < i \le s$.

(c) Prove that a partition of $n$ as in Part (a) also exists and is unique, if we
impose the opposite condition: $k_1 \le 2$, $k_i - k_{i-1} \le 2$ for $1 < i \le s$.

(d) Let $K_n$ be the number of partitions of $n$ as in Part (a) with $s$ even and $H_n$
be the same with $s$ odd. Prove that $|K_n - H_n| \le 1$.

(e) (Equivalent to (d).) Let

$$(1 - x)(1 - x^2)(1 - x^3)(1 - x^5)(1 - x^8)\cdots = 1 + g_1x + g_2x^2 + g_3x^3 + \dots$$

(or, in the short notation, $\prod_{k=1}^{\infty}(1 - x^{F_k}) = 1 + \sum_{n=1}^{\infty}g_nx^n$). Prove that $|g_n| \le 1$ for
all $n$.

(f) (Generalization of (e).) Prove that for every $k$ and $\ell > k$, all the coefficients of
the polynomial $(1 - x^{F_k})(1 - x^{F_{k+1}})\cdots(1 - x^{F_\ell})$ equal 0 or $\pm 1$.

(g) (An addition to (e).) Prove that, for any $k \ge 4$,

$$g_n = 0 \quad\text{for}\quad 2F_k - 2 < n < 2F_k + F_{k-3}.$$

Figure 3.2. A machine for computing p(n)

LECTURE 4

Equations of Degree Three and Four 



4.1 Introduction. The formula

$$x_{1,2} = \frac{-p \pm \sqrt{p^2 - 4q}}{2}$$

for the roots of a quadratic equation $x^2 + px + q = 0$ is one of the most popular formulas in mathematics.
It is short and convenient, it has a wide variety of applications, and everybody is
urged to memorize it.

It is also widely known that there exists an explicit formula for solving cubic
equations, but students are, in general, not encouraged to learn it. The usual
explanation is that it is long, complicated and not convenient to use. These warnings,
however, are not always sufficient to temper one's exploration mood, and some
people look for this formula in various textbooks and reference books. Here
is what they find there.



4.2 The formula. We shall consider the equation

$$x^3 + px + q = 0. \tag{4.1}$$

(The general equation $x^3 + ax^2 + bx + c = 0$ can be reduced to an equation of this form
by the substitution $x = y - \dfrac{a}{3}$:

$$x^3 + ax^2 + bx + c = \left(y - \frac{a}{3}\right)^3 + a\left(y - \frac{a}{3}\right)^2 + b\left(y - \frac{a}{3}\right) + c = y^3 + \left(b - \frac{a^2}{3}\right)y + \left(\frac{2a^3}{27} - \frac{ab}{3} + c\right),$$

which is $y^3 + py + q$ with $p = b - \dfrac{a^2}{3}$, $q = \dfrac{2a^3}{27} - \dfrac{ab}{3} + c$.)

The formula is¹

$$x = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{p^3}{27} + \frac{q^2}{4}}} + \sqrt[3]{-\frac{q}{2} - \sqrt{\frac{p^3}{27} + \frac{q^2}{4}}}. \tag{4.2}$$

What we see is that this formula is not long and not complicated. The two
cubic roots are very similar to each other: memorize one, and you will remember
the other one. The denominators 2, 4, and 27 are also easy to memorize; moreover,
it is possible to avoid them if you write the given equation as

$$x^3 + 3rx + 2s = 0;$$

then the formula becomes $x = \sqrt[3]{-s + \sqrt{r^3 + s^2}} + \sqrt[3]{-s - \sqrt{r^3 + s^2}}$. So, maybe,
this formula is not as bad as most people think? To form our opinion, let us start
with the simplest thing.

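4.3 The proof of the formula.

Theorem 4.1. If $\dfrac{p^3}{27} + \dfrac{q^2}{4} \ge 0$, then (4.2) is a solution of the equation (4.1).

Proof. Let

$$A = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{p^3}{27} + \frac{q^2}{4}}}, \qquad B = \sqrt[3]{-\frac{q}{2} - \sqrt{\frac{p^3}{27} + \frac{q^2}{4}}}.$$

Then $A^3 + B^3 = -q$,

$$AB = \sqrt[3]{-\frac{q}{2} + \sqrt{\frac{p^3}{27} + \frac{q^2}{4}}}\cdot\sqrt[3]{-\frac{q}{2} - \sqrt{\frac{p^3}{27} + \frac{q^2}{4}}} = \sqrt[3]{\frac{q^2}{4} - \frac{p^3}{27} - \frac{q^2}{4}} = \sqrt[3]{-\frac{p^3}{27}} = -\frac{p}{3},$$

and

$$x^3 = (A + B)^3 = A^3 + 3AB(A + B) + B^3 = -px - q, \qquad x^3 + px + q = 0,$$

as required. □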

4.4 Let us try to use the formula. If this formula is good, it should be 
useful. Let us try to apply it to solving equations. 

Example 4.1. Consider the equation

$$x^3 + 6x - 2 = 0.$$

According to the formula,

$$x = \sqrt[3]{1 + \sqrt{8 + 1}} + \sqrt[3]{1 - \sqrt{8 + 1}} = \sqrt[3]{4} - \sqrt[3]{2}.$$

This result is undoubtedly good: without the formula, we should have hardly been 
able to guess that this difference of cubic radicals is a root of our equation. 



¹ This formula is usually called the Cardano Formula or the Cardano-Tartaglia Formula. The
reader can find the dramatic history of its discovery in S. Gindikin's book [35].

Example 4.2. Consider the equation

$$x^3 + 3x - 4 = 0.$$

According to the formula,

$$x = \sqrt[3]{2 + \sqrt{1 + 4}} + \sqrt[3]{2 - \sqrt{1 + 4}} = \sqrt[3]{2 + \sqrt{5}} + \sqrt[3]{2 - \sqrt{5}}.$$

Not bad. But if you use your pocket calculator to approximate the answer, you
will, probably, notice that $\sqrt[3]{2 + \sqrt{5}} + \sqrt[3]{2 - \sqrt{5}} = 1$. The best way to prove this
is to plug the left hand side and the right hand side of the last equality into the
equation $x^3 + 3x - 4 = 0$ to confirm that both are solutions, and then prove that
the equation has at most one (real) solution (the function $x^3 + 3x - 4$ is monotone:
if $x_1 < x_2$, then $x_1^3 + 3x_1 - 4 < x_2^3 + 3x_2 - 4$).

This casts the first doubt: the quadratic formula always shows whether the
solution is rational; here the solution is rational (even integral), but the formula
fails to show this.
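The pocket-calculator experiment of Examples 4.1 and 4.2 can be repeated with a few lines of code. This is a sketch of formula (4.2) for the case of a non-negative discriminant only; the helper `real_cbrt` is introduced here because raising a negative float to the power 1/3 does not give the real cube root in Python, and both function names are illustrative.

```python
import math


def real_cbrt(v):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def cardano(p, q):
    """Root of x^3 + p x + q = 0 given by (4.2); needs p^3/27 + q^2/4 >= 0."""
    disc = p ** 3 / 27 + q ** 2 / 4
    if disc < 0:
        raise ValueError("formula (4.2) needs cube roots of complex numbers here")
    root = math.sqrt(disc)
    return real_cbrt(-q / 2 + root) + real_cbrt(-q / 2 - root)


if __name__ == "__main__":
    print(cardano(6, -2))   # Example 4.1
    print(cardano(3, -4))   # Example 4.2
```

Floating-point output can only suggest that the second value equals 1; the monotonicity argument in the text is what actually proves it.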

Example 4.3. To resolve our doubts, let us consider an equation with the
solutions known in advance. By the way, the coefficient $a$ of $x^2$ in the equation
$x^3 + ax^2 + bx + c = 0$ equals minus the sum of the roots; so, for our equation (4.1)
the sum of the roots should be zero. Let us take $x_1 = -3$, $x_2 = 2$, $x_3 = 1$. The
equation with these roots is

$$(x + 3)(x - 2)(x - 1) = x^3 - 7x + 6 = 0.$$

Solve it using our formula:

$$x = \sqrt[3]{-3 + \sqrt{-\frac{343}{27} + 9}} + \sqrt[3]{-3 - \sqrt{-\frac{343}{27} + 9}} = \sqrt[3]{-3 + \sqrt{-\frac{100}{27}}} + \sqrt[3]{-3 - \sqrt{-\frac{100}{27}}}.$$

Nothing like $-3$, 2, or 1. Too bad.

Conclusions. The formula is simple and easy to memorize, but it is somewhat 
unreliable: sometimes it gives a solution in an unsatisfactory form, sometimes it 
does not give any solution. Let us try to locate the source of these difficulties. 

4.5 How many solutions? The question is very natural. Our formula gives, 
at best, one solution, whereas a cubic equation may have as many as 3 (real) 
solutions (see Example 4.3 above). 

Consider the graph of the function

$$y = x^3 + px + q.$$

The graph of $y = x^3$ is the well-known cubic parabola (Figure 4.1); when we add $px$,
the graph will be transformed as shown in Figure 4.1, and it will look different
for $p > 0$ and $p < 0$. Finally, the graph of $y = x^3 + px + q$ may be obtained from one
of the graphs of Figure 4.1 by a vertical upward or downward translation (Figure
4.2). We see the following. If $p \ge 0$, then the number of solutions is always 1. If
$p < 0$, then the number of solutions is 1, 2, or 3. Let us learn how to distinguish
between these cases.

Figure 4.1. Cubic parabolas

Figure 4.2. The number of roots



Lemma 4.4. The equation

$$x^3 + px + q = 0$$

has precisely 2 solutions if and only if $p < 0$ and $\dfrac{p^3}{27} + \dfrac{q^2}{4} = 0$.

Remark 4.5. This result, with a proof different from the one below, is discussed
in Lecture 8. Let us recall that the expression $\dfrac{p^3}{27} + \dfrac{q^2}{4}$, crucially important for our
current purposes, is called the discriminant of the polynomial $x^3 + px + q$.

Proof of Lemma. To have two solutions, the equation has to have a multiple
root. If this root is $a$, then the third root should be $-2a$, since the sum of the roots
is 0. In particular, $a \ne 0$ (otherwise, there is only one root, 0). Hence

$$x^3 + px + q = (x - a)^2(x + 2a) = x^3 - 3a^2x + 2a^3,$$

$p = -3a^2$, $q = 2a^3$. In this case $p < 0$ and $\dfrac{p^3}{27} + \dfrac{q^2}{4} = -\dfrac{27a^6}{27} + \dfrac{4a^6}{4} = 0$.
Conversely, if $p < 0$ and $\dfrac{p^3}{27} + \dfrac{q^2}{4} = 0$, then we take $a = \sqrt[3]{\dfrac{q}{2}}$ and deduce $q = 2a^3$,
$p = -\sqrt[3]{\dfrac{27q^2}{4}} = -\sqrt[3]{27a^6} = -3a^2$, hence

$$x^3 + px + q = x^3 - 3a^2x + 2a^3 = (x - a)^2(x + 2a)$$

which has a multiple root $a$. □

Consider now a general equation $x^3 + px + q = 0$ with $p < 0$ and without
multiple roots. Obviously, there are two different numbers $r$ such that the equation
$x^3 + px + q = r$ has a multiple root (see Figure 4.2). If these two $r$'s have the same
sign (that is, their product is positive), then the equation $x^3 + px + q = 0$ has one
solution; if they have opposite signs (their product is negative), then the number
of solutions is three.

Let us make computations. According to the Lemma, $x^3 + px + q = r$ has 2
solutions if and only if

$$\frac{p^3}{27} + \frac{(q - r)^2}{4} = 0,$$

that is,

$$r = q \pm \sqrt{-\frac{4p^3}{27}}.$$

The product of the two values of $r$ is

$$q^2 + \frac{4p^3}{27} = 4\left(\frac{p^3}{27} + \frac{q^2}{4}\right).$$

We thus get the following result.

Theorem 4.2. The equation $x^3 + px + q = 0$ has

1 solution, if $\dfrac{p^3}{27} + \dfrac{q^2}{4} > 0$ (or $p = q = 0$);

2 solutions, if $\dfrac{p^3}{27} + \dfrac{q^2}{4} = 0$ (and $p < 0$);

3 solutions, if $\dfrac{p^3}{27} + \dfrac{q^2}{4} < 0$.
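Theorem 4.2 translates directly into a small decision procedure. The sketch below uses exact rational arithmetic so that the borderline case of a zero discriminant is not lost to rounding; the function name `count_real_solutions` is illustrative.

```python
from fractions import Fraction


def count_real_solutions(p, q):
    """Number of distinct real solutions of x^3 + p x + q = 0 (Theorem 4.2)."""
    disc = Fraction(p) ** 3 / 27 + Fraction(q) ** 2 / 4
    if disc > 0 or (p == 0 and q == 0):
        return 1
    if disc == 0:
        return 2  # here p < 0 automatically, since p = q = 0 was handled above
    return 3


if __name__ == "__main__":
    print(count_real_solutions(3, -4))   # Example 4.2
    print(count_real_solutions(-7, 6))   # Example 4.3
    print(count_real_solutions(-3, 2))   # (x - 1)^2 (x + 2)
```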

4.6 Back to the formula. Since the expression $\dfrac{p^3}{27} + \dfrac{q^2}{4}$ appears both in
Theorem 4.2 and formula (4.2), there arises a link between these two results which
explains fairly well the experimental observations of Section 4.4.

Theorem 4.3. If equation (4.1) has only one real root (or two real roots), then
the right hand side of formula (4.2) is defined (is the sum of two cubic roots of real
numbers). If the equation (4.1) has three (different) real roots, then the right hand
side of formula (4.2) is undefined: it is a sum of cubic roots of complex numbers.

4.7 The case of negative discriminant. To apply formula (4.2) to this
case, we need to learn how to extract cubic roots of complex numbers. Let us try
to do it. Our problem: given $a$ and $b$, find $x$ and $y$ such that

$$(x + iy)^3 = a + ib.$$

The last equation yields the system

$$x^3 - 3xy^2 = a, \qquad 3x^2y - y^3 = b,$$

which can be reduced to the equation

$$27b^2x^3 = (x^3 - a)(8x^3 + a)^2.$$

The latter is a cubic equation with respect to $t = x^3$, and it must have three real
solutions (since our initial problem has three solutions); thus, our formula will not
help to solve it.

There is a different approach to extracting roots of complex numbers based on
de Moivre's Formula

$$\sqrt[3]{r(\cos\theta + i\sin\theta)} = \sqrt[3]{r}\left(\cos\frac{\theta}{3} + i\sin\frac{\theta}{3}\right).$$

So, we can use trigonometry to solve cubic equations with the help of formula (4.2).
Trigonometry, however, can be used for solving cubic equations without any formula
like (4.2).
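To see de Moivre's Formula rescue formula (4.2) in Example 4.3, one can let the computer extract the complex cube roots. The sketch below is one possible arrangement, not the text's own procedure: it takes the three cube roots $A$ of the first radicand and pairs each with $B = -p/(3A)$, which enforces the relation $AB = -p/3$ from the proof of Theorem 4.1. It assumes $p \ne 0$, and the function name is illustrative.

```python
import cmath


def cardano_all_roots(p, q):
    """Roots of x^3 + p x + q = 0 via (4.2) with complex cube roots (p != 0)."""
    disc = p ** 3 / 27 + q ** 2 / 4
    w = -q / 2 + cmath.sqrt(disc)  # number under the first cube root
    r, theta = cmath.polar(w)
    roots = []
    for k in range(3):
        # de Moivre: the three cube roots of w
        A = cmath.rect(r ** (1 / 3), (theta + 2 * cmath.pi * k) / 3)
        B = -p / (3 * A)  # matching value of the second cube root
        roots.append(A + B)
    return roots


if __name__ == "__main__":
    for z in cardano_all_roots(-7, 6):  # Example 4.3
        print(z)
```

For $x^3 - 7x + 6 = 0$ the imaginary parts cancel up to rounding, and the real parts come out close to $-3$, 1 and 2.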



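4.8 Solving cubic equations using trigonometry. In trigonometry, there
is a formula for the sine of a triple angle:

$$\sin 3\theta = 3\sin\theta - 4\sin^3\theta.$$

Thus, if our equation is

$$4x^3 - 3x + \sin 3\theta = 0,$$

or

$$x^3 - \frac{3}{4}x + \frac{\sin 3\theta}{4} = 0,$$

then the solution is $x = \sin\theta$. In other words, if $p = -\dfrac{3}{4}$, then the solution of
equation (4.1) is

$$x = \sin\left(\frac{1}{3}\sin^{-1}(4q)\right).$$

What if $p \ne -\dfrac{3}{4}$? In this case, we can make the substitution $x = ay$. The equation
(4.1) becomes

$$a^3y^3 + apy + q = 0,$$

or

$$y^3 + \frac{p}{a^2}y + \frac{q}{a^3} = 0.$$

Thus, if $\dfrac{p}{a^2} = -\dfrac{3}{4}$, that is, $a = \sqrt{-\dfrac{4p}{3}}$, then the solution is

$$y = \sin\left(\frac{1}{3}\sin^{-1}\frac{4q}{a^3}\right),$$

$$x = ay = \sqrt{-\frac{4p}{3}}\,\sin\left(\frac{1}{3}\sin^{-1}\left(\frac{9q}{4p^2}\sqrt{-\frac{4p}{3}}\right)\right). \tag{4.3}$$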

Isn't this a formula? Well, certainly, $p$ should be negative. But also the argument
of $\sin^{-1}$ should be between $-1$ and 1:

$$-1 \le \frac{9q}{4p^2}\sqrt{-\frac{4p}{3}} \le 1, \quad \frac{81q^2}{16p^4}\cdot\frac{-4p}{3} \le 1, \quad -\frac{27q^2}{4p^3} \le 1, \quad \frac{27q^2}{4p^3} \ge -1, \quad 27q^2 \le -4p^3,$$

$$27q^2 + 4p^3 \le 0, \qquad \frac{p^3}{27} + \frac{q^2}{4} \le 0.$$

We see that formula (4.3) works precisely when formula (4.2) does not work, so,
together, formulas (4.2) and (4.3) cover the whole variety of cases. By the way,
formula (4.3) always gives 3 solutions: if

$$\alpha = \sin^{-1}\left(\frac{9q}{4p^2}\sqrt{-\frac{4p}{3}}\right),$$

then the 3 solutions are

$$\sqrt{-\frac{4p}{3}}\,\sin\left(\frac{1}{3}(\alpha + 2k\pi)\right), \quad k = 0, 1, 2.$$
4.9 Summary: how to solve cubic equations. Formula (4.2) (together 
with formula (4.3)) always expresses the solutions of equation (4.1) in terms of p and 
q. For practical purposes, this formula may be not very useful; approximate values 
of roots of cubic equations can be found by other methods (used, in particular, by 
pocket calculators) ; and it does not seem likely that one can use such a formula on 
the intermediate steps of calculations (plugging solutions of cubic equations into 
other equations). The significance of formula (4.2) is mainly theoretical, and we 
shall discuss this aspect below. Now we turn our attention to equations of degree 
four. 

4.10 Equations of degree 4: what is so special about the number 4? 

Equations of degree 4 can be reduced to equations of degree 3. This phenomenon
has no direct analogies for equations of degree greater than 4, and for this reason
deserves special consideration.

What is so special about the number 4? Of many possible answers to this 
question, we shall choose one which will be technically useful to us. 

Among different mathematical problems, there are so-called "combinatorial 
problems". They look like this: "given such and such number of such and such 
things, in how many ways can we do such and such thing?" 

For example: "In a class of 20 students, in how many ways can a president and
two vice-presidents be selected?" Answer: $20 \times \dfrac{19 \cdot 18}{2} = 3{,}420$.

Or: "In how many ways can one choose 2 green balls and 3 red balls from a
box containing 10 balls of each color?" Answer: $\dfrac{10 \cdot 9}{2} \cdot \dfrac{10 \cdot 9 \cdot 8}{2 \cdot 3} = 5{,}400$. And so
on.

We can observe that the answers are relatively big, usually substantially bigger
than the numbers given in the statement of the problem. Do you know a combinatorial
problem where the answer is less than the given number (and, say, greater
than 1)? We know one such problem.

Problem: in how many ways can one break a set of 4 elements into 2 pairs? 
Answer. 3 (if the set is {ABCD}, then the solutions are AB/CD, AC/BD, 
AD/BC). 
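The three counts above can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper `pairings` is a hypothetical name, and it enumerates the splittings by choosing the partner of the first element.

```python
from math import comb

# A president (20 choices) and an unordered pair of vice-presidents from the other 19.
officers = 20 * comb(19, 2)

# 2 green balls out of 10 and 3 red balls out of 10.
balls = comb(10, 2) * comb(10, 3)

def pairings(elements):
    """All ways to split a 4-element set into two unordered pairs."""
    elements = sorted(elements)
    first = elements[0]
    result = []
    for partner in elements[1:]:
        rest = tuple(e for e in elements if e not in (first, partner))
        result.append(((first, partner), rest))
    return result

splits = pairings("ABCD")
```

Here `splits` lists AB/CD, AC/BD, AD/BC: a splitting is fixed by the partner of A, and there are three candidates for it.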

Surprisingly, this simple problem provides the main idea for solving equations 
of degree 4. 

4.11 The auxiliary cubic equation. Let

(4.4)    $x^4 + px^2 + qx + r = 0$

be our equation (precisely as in the cubic case, we can get rid of the "second leading
term" with $x^3$ by making a substitution of the form $x = y + a$). Let $x_1, x_2, x_3, x_4$
be the solutions of equation (4.4). Then

$$x^4 + px^2 + qx + r = (x - x_1)(x - x_2)(x - x_3)(x - x_4),$$

whence

$$\begin{aligned}
0 &= x_1 + x_2 + x_3 + x_4,\\
p &= x_1x_2 + x_1x_3 + x_1x_4 + x_2x_3 + x_2x_4 + x_3x_4,\\
-q &= x_1x_2x_3 + x_1x_2x_4 + x_1x_3x_4 + x_2x_3x_4,\\
r &= x_1x_2x_3x_4.
\end{aligned}$$

Now, let us look once more at our problem and set

$$\begin{aligned}
y_1 &= (x_1 + x_2)(x_3 + x_4),\\
y_2 &= (x_1 + x_3)(x_2 + x_4),\\
y_3 &= (x_1 + x_4)(x_2 + x_3).
\end{aligned}$$

Notice that, since $x_1 + x_2 + x_3 + x_4 = 0$, we can also write

$$\begin{aligned}
y_1 &= -(x_1 + x_2)^2 = -(x_3 + x_4)^2,\\
y_2 &= -(x_1 + x_3)^2 = -(x_2 + x_4)^2,\\
y_3 &= -(x_1 + x_4)^2 = -(x_2 + x_3)^2.
\end{aligned}$$

Let

(4.5)    $y^3 + ay^2 + by + c = 0$

be the cubic equation with the roots $y_1, y_2, y_3$. Then

$$\begin{aligned}
a &= -y_1 - y_2 - y_3,\\
b &= y_1y_2 + y_1y_3 + y_2y_3,\\
c &= -y_1y_2y_3.
\end{aligned}$$

4.12 How to express a, b, c via p, q, r? 

Theorem 4.4. $a = -2p$, $b = p^2 - 4r$, $c = q^2$.

Proof (direct computation). It is slightly easier with a and c and longer with b.

$$\begin{aligned}
a &= -y_1 - y_2 - y_3 = (x_1 + x_2)^2 + (x_1 + x_3)^2 + (x_2 + x_3)^2\\
&= 2(x_1^2 + x_2^2 + x_3^2 + x_1x_2 + x_1x_3 + x_2x_3)\\
&= x_1^2 + x_2^2 + x_3^2 + (x_1 + x_2 + x_3)^2 = x_1^2 + x_2^2 + x_3^2 + (-x_4)^2\\
&= (x_1 + x_2 + x_3 + x_4)^2 - 2p = -2p.
\end{aligned}$$

$$\begin{aligned}
c &= -y_1y_2y_3 = (x_1 + x_2)^2(x_1 + x_3)^2(x_1 + x_4)^2\\
&= [(x_1 + x_2)(x_1 + x_3)(x_1 + x_4)]^2\\
&= [x_1^3 + x_1^2(x_2 + x_3 + x_4) + x_1(x_2x_3 + x_2x_4 + x_3x_4) + x_2x_3x_4]^2\\
&= (x_1^3 - x_1^3 - q)^2 = q^2.
\end{aligned}$$

$$\begin{aligned}
b &= y_1y_2 + y_1y_3 + y_2y_3\\
&= (x_1 + x_2)^2(x_1 + x_3)^2 + (x_1 + x_2)^2(x_1 + x_4)^2 + (x_1 + x_3)^2(x_1 + x_4)^2\\
&= x_1^4 + 2x_1^3(x_2 + x_3) + x_1^2(x_2^2 + x_3^2 + 4x_2x_3) + 2x_1x_2x_3(x_2 + x_3) + x_2^2x_3^2\\
&\quad + x_1^4 + 2x_1^3(x_2 + x_4) + x_1^2(x_2^2 + x_4^2 + 4x_2x_4) + 2x_1x_2x_4(x_2 + x_4) + x_2^2x_4^2\\
&\quad + x_1^4 + 2x_1^3(x_3 + x_4) + x_1^2(x_3^2 + x_4^2 + 4x_3x_4) + 2x_1x_3x_4(x_3 + x_4) + x_3^2x_4^2\\
&= x_1^4 + 2x_1^3(x_2 + x_3) + x_1^2(x_2^2 + x_3^2 + 4x_2x_3) - 2x_1x_2x_3(x_1 + x_4) + x_2^2x_3^2\\
&\quad + x_1^4 + 2x_1^3(x_2 + x_4) + x_1^2(x_2^2 + x_4^2 + 4x_2x_4) - 2x_1x_2x_4(x_1 + x_3) + x_2^2x_4^2\\
&\quad + x_1^4 + 2x_1^3(x_3 + x_4) + x_1^2(x_3^2 + x_4^2 + 4x_3x_4) - 2x_1x_3x_4(x_1 + x_2) + x_3^2x_4^2\\
&= 3x_1^4 + 4x_1^3(x_2 + x_3 + x_4) + 2x_1^2(x_2 + x_3 + x_4)^2\\
&\quad - 2x_1^2(x_2x_3 + x_2x_4 + x_3x_4) - 6x_1x_2x_3x_4 + x_2^2x_3^2 + x_2^2x_4^2 + x_3^2x_4^2\\
&= x_1^4 - 2x_1^2(x_2x_3 + x_2x_4 + x_3x_4) - 6x_1x_2x_3x_4 + x_2^2x_3^2 + x_2^2x_4^2 + x_3^2x_4^2.
\end{aligned}$$

On the other hand,

$$\begin{aligned}
p &= -x_1^2 + x_2x_3 + x_2x_4 + x_3x_4;\\
p^2 &= x_1^4 - 2x_1^2(x_2x_3 + x_2x_4 + x_3x_4) + (x_2x_3 + x_2x_4 + x_3x_4)^2;\\
p^2 - b &= (x_2x_3 + x_2x_4 + x_3x_4)^2 - (x_2^2x_3^2 + x_2^2x_4^2 + x_3^2x_4^2) + 6x_1x_2x_3x_4\\
&= 2(x_2^2x_3x_4 + x_2x_3^2x_4 + x_2x_3x_4^2) + 6x_1x_2x_3x_4\\
&= 2x_2x_3x_4(x_2 + x_3 + x_4) + 6x_1x_2x_3x_4\\
&= -2x_1x_2x_3x_4 + 6x_1x_2x_3x_4 = 4x_1x_2x_3x_4 = 4r. \qquad\Box
\end{aligned}$$
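Theorem 4.4 is an identity between integer polynomials, so it can be spot-checked exactly on integer roots. The sketch below is only a sanity check under that assumption; the helper names `elementary` and `check_theorem` are hypothetical, and the fourth root is chosen so that the roots sum to zero.

```python
from itertools import combinations
from math import prod

def elementary(roots, k):
    """k-th elementary symmetric polynomial of the given roots."""
    return sum(prod(c) for c in combinations(roots, k))

def check_theorem(x1, x2, x3):
    x4 = -(x1 + x2 + x3)                      # forces x1 + x2 + x3 + x4 = 0
    xs = (x1, x2, x3, x4)
    p, q, r = elementary(xs, 2), -elementary(xs, 3), elementary(xs, 4)
    y1 = (x1 + x2) * (x3 + x4)
    y2 = (x1 + x3) * (x2 + x4)
    y3 = (x1 + x4) * (x2 + x3)
    a = -(y1 + y2 + y3)
    b = y1 * y2 + y1 * y3 + y2 * y3
    c = -y1 * y2 * y3
    return (a, b, c) == (-2 * p, p * p - 4 * r, q * q)

# Exhaustive check over a small grid of integer roots.
all_ok = all(check_theorem(x1, x2, x3)
             for x1 in range(-4, 5)
             for x2 in range(-4, 5)
             for x3 in range(-4, 5))
```

A finite grid does not replace the proof, but a sign slip in the long computation for b would show up here immediately.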



This proof is convincing but it does not reveal the reasons for the existence of
an expression for a, b, c via p, q, r. Let us try to explain these reasons. If we plug the
(very first) formulas for $y_1, y_2, y_3$ into the definition of a, b, c, then a, b, c will become
polynomials in $x_1, x_2, x_3, x_4$ (of degrees 2, 4, 6). Moreover, these polynomials in
$x_1, x_2, x_3, x_4$ are symmetric, which means that if you switch any two variables $x_i, x_j$,
the polynomial will remain the same. Indeed, if you, say, switch $x_1$ with $x_2$, then
$y_1$ remains unchanged while $y_2$ will be switched with $y_3$; something similar will
happen if you switch any two x's. But a, b, c remain unchanged, since they are,
obviously, symmetric with respect to the y's.

There is a theorem in algebra (not difficult) stating that any symmetric polynomial
in $x_1, x_2, x_3, x_4$ can be expressed as a polynomial in the "elementary symmetric
polynomials"

$$\begin{aligned}
e_1 &= x_1 + x_2 + x_3 + x_4,\\
e_2 &= x_1x_2 + x_1x_3 + x_1x_4 + x_2x_3 + x_2x_4 + x_3x_4,\\
e_3 &= x_1x_2x_3 + x_1x_2x_4 + x_1x_3x_4 + x_2x_3x_4,\\
e_4 &= x_1x_2x_3x_4.
\end{aligned}$$

(A similar theorem holds for any number of variables.) Since $e_1 = 0$, a, b, c are
polynomials in $e_2, e_3, e_4$, that is, in p, q, r. Since the degrees of p, q, r are 2, 3, 4, one
should have

$$\begin{aligned}
a &= Ap,\\
b &= Bp^2 + Cr,\\
c &= Dp^3 + Eq^2 + Fpr,
\end{aligned}$$

and we can find A, ..., F by plugging in particular values for $x_1, x_2, x_3, x_4$ (such that
$x_1 + x_2 + x_3 + x_4 = 0$). For example, if $x_1 = 1$, $x_2 = -1$, $x_3 = x_4 = 0$, then
$y_1 = 0$, $y_2 = y_3 = -1$, $p = -1$, $q = r = 0$, $a = 2$, $b = 1$, $c = 0$. Hence

$$\begin{aligned}
2 &= A \cdot (-1),\\
1 &= B \cdot 1 + C \cdot 0,\\
0 &= D \cdot (-1) + E \cdot 0 + F \cdot 0,
\end{aligned}$$

whence $A = -2$, $B = 1$, $D = 0$. Similarly, we can find C, E and F.
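The "plug in particular values" method can be carried out mechanically. The following sketch is an illustration only: the three sample root sets and the helper names `invariants` and `solve` are assumptions of this example, and the small linear systems are solved exactly over the rationals.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def invariants(xs):
    """Return (p, q, r, a, b, c) for four roots xs whose sum is zero."""
    assert sum(xs) == 0
    def e(k):
        return sum(prod(t) for t in combinations(xs, k))
    x1, x2, x3, x4 = xs
    y = [(x1 + x2) * (x3 + x4), (x1 + x3) * (x2 + x4), (x1 + x4) * (x2 + x3)]
    a = -sum(y)
    b = y[0] * y[1] + y[0] * y[2] + y[1] * y[2]
    c = -prod(y)
    return e(2), -e(3), e(4), a, b, c

def solve(matrix, rhs):
    """Gauss-Jordan elimination over the rationals (square, non-singular system)."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(t)] for row, t in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for i in range(n):
            if i != col:
                factor = rows[i][col]
                rows[i] = [vi - factor * vc for vi, vc in zip(rows[i], rows[col])]
    return [row[n] for row in rows]

# Hypothetical sample root sets, each with zero sum.
samples = [invariants(xs) for xs in [(1, -1, 0, 0), (1, 1, -1, -1), (1, 1, 1, -3)]]

(A,) = solve([[samples[0][0]]], [samples[0][3]])                       # a = A p
B, C = solve([[s[0] ** 2, s[2]] for s in samples[:2]],
             [s[4] for s in samples[:2]])                              # b = B p^2 + C r
D, E, F = solve([[s[0] ** 3, s[1] ** 2, s[0] * s[2]] for s in samples],
                [s[5] for s in samples])                               # c = D p^3 + E q^2 + F p r
```

The resulting coefficients agree with Theorem 4.4: $a = -2p$, $b = p^2 - 4r$, $c = q^2$.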

4.13 How to express $x_1, x_2, x_3, x_4$ via $y_1, y_2, y_3$? Thus, given equation (4.4)
of degree 4, we can compose the auxiliary equation (4.5) of degree 3, solve it, and
find $y_1, y_2, y_3$. How to find our initial unknowns, $x_1, x_2, x_3, x_4$? It is easy: since

$$\begin{aligned}
y_1 &= -(x_1 + x_2)^2 = -(x_3 + x_4)^2,\\
y_2 &= -(x_1 + x_3)^2 = -(x_2 + x_4)^2,\\
y_3 &= -(x_1 + x_4)^2 = -(x_2 + x_3)^2,
\end{aligned}$$

and $x_1 + x_2 + x_3 + x_4 = 0$, we have

$$\begin{aligned}
x_1 + x_2 &= -(x_3 + x_4) = \pm\sqrt{-y_1},\\
x_1 + x_3 &= -(x_2 + x_4) = \pm\sqrt{-y_2},\\
x_1 + x_4 &= -(x_2 + x_3) = \pm\sqrt{-y_3},
\end{aligned}$$

$$3x_1 + x_2 + x_3 + x_4 = 2x_1 = \pm\sqrt{-y_1} \pm \sqrt{-y_2} \pm \sqrt{-y_3},$$

$$x_1 = \frac{\pm\sqrt{-y_1} \pm \sqrt{-y_2} \pm \sqrt{-y_3}}{2}.$$

Formulas for $x_2, x_3, x_4$ are absolutely the same. Varying the signs in the expression
$\pm\sqrt{-y_1} \pm \sqrt{-y_2} \pm \sqrt{-y_3}$, we obtain 8 numbers, which are $\pm 2x_1, \pm 2x_2, \pm 2x_3, \pm 2x_4$;
after division by 2 they become $\pm x_1, \pm x_2, \pm x_3, \pm x_4$.
This completes solving equation (4.4). For the reader's convenience, we shall repeat
the whole procedure, from the beginning to the end.



4.14 Conclusion: solving equation (4.4). 1. Given an equation

$$x^4 + px^2 + qx + r = 0.$$

2. Solve the auxiliary equation

$$y^3 - 2py^2 + (p^2 - 4r)y + q^2 = 0;$$

let $y_1, y_2, y_3$ be the solutions.

3. Consider the eight numbers

$$\frac{\pm\sqrt{-y_1} \pm \sqrt{-y_2} \pm \sqrt{-y_3}}{2}.$$

Four of these numbers are solutions of the given equation, the remaining are "minus
solutions." Plug and select the solutions.
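The three steps can be sketched in Python as follows. This is an illustration under explicit assumptions, not the text's own method for step 2: the coefficients p, q, r are taken to be real, the auxiliary cubic is solved numerically (one real root by bisection, the other two from the remaining quadratic), and the sample equation $x^4 - 25x^2 + 60x - 36 = 0$ (roots 1, 2, 3, -6) is a hypothetical example. The names `real_cubic_roots` and `solve_quartic` are likewise illustrative.

```python
import cmath
from itertools import product

def real_cubic_roots(a, b, c):
    """Roots of y^3 + a y^2 + b y + c = 0 for real a, b, c."""
    def g(y):
        return ((y + a) * y + b) * y + c
    bound = 1 + max(abs(a), abs(b), abs(c))   # Cauchy bound: every root has |y| < bound
    lo, hi = -bound, bound                    # hence g(lo) < 0 < g(hi)
    for _ in range(200):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    y0 = (lo + hi) / 2
    b1 = a + y0                               # g(y) = (y - y0)(y^2 + b1 y + b0)
    b0 = b + y0 * b1
    d = cmath.sqrt(b1 * b1 - 4 * b0)
    return [complex(y0), (-b1 + d) / 2, (-b1 - d) / 2]

def solve_quartic(p, q, r, tol=1e-7):
    """Steps 1-3 of the procedure for x^4 + p x^2 + q x + r = 0 (real p, q, r)."""
    def f(x):
        return x ** 4 + p * x ** 2 + q * x + r
    ys = real_cubic_roots(-2 * p, p * p - 4 * r, q * q)       # step 2
    radicals = [cmath.sqrt(-y) for y in ys]
    scale = 1 + abs(p) + abs(q) + abs(r)
    roots = []
    for signs in product((1, -1), repeat=3):                  # step 3: eight candidates
        x = sum(s * w for s, w in zip(signs, radicals)) / 2
        # "Plug and select"; also skip numerically repeated values.
        if abs(f(x)) < tol * scale and all(abs(x - z) > 1e-6 for z in roots):
            roots.append(x)
    return roots

roots = solve_quartic(-25, 60, -36)
```

For this sample the auxiliary cubic is $y^3 + 50y^2 + 769y + 3600 = 0$ with roots -9, -16, -25, so the eight candidates are $(\pm 3 \pm 4 \pm 5)/2$, and plugging in keeps 1, 2, 3 and -6.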





4.15 Exercises.
4.1. The equation

$$x^3 + 9x + 26 = 0$$

can be solved explicitly by formula (4.2) with all square and cubic roots being
integers:

$$x = \sqrt[3]{-13 + \sqrt{27 + 169}} + \sqrt[3]{-13 - \sqrt{27 + 169}}
= \sqrt[3]{-13 + 14} + \sqrt[3]{-13 - 14} = \sqrt[3]{1} + \sqrt[3]{-27} = 1 - 3 = -2.$$

Find infinitely many cubic equations with non-zero integral coefficients p and q with
the same property.

4.2. Prove that the formula

$$x = r - \frac{p}{3r},$$

where r is an arbitrary value of the cubic root $\sqrt[3]{-\frac{q}{2} + s}$ where, in turn, s is an
arbitrary value of $\sqrt{\frac{p^3}{27} + \frac{q^2}{4}}$, gives precisely 3 complex solutions of a cubic equation
$x^3 + px + q = 0$ with complex coefficients p, q such that $p \ne 0$, $\frac{p^3}{27} + \frac{q^2}{4} \ne 0$.

4.3. Prove that if a cubic equation $x^3 + px + q = 0$ (with real coefficients) has
a double (not triple!) root, then formula (4.2) gives the other (not double) root,
and it is equal to $\sqrt[3]{-4q}$.

4.4. For the hyperbolic sine, $\sinh\alpha = \frac{e^{\alpha} - e^{-\alpha}}{2}$, there is a formula

$$\sinh 3\alpha = 3\sinh\alpha + 4\sinh^3\alpha.$$

Use this formula as in Section 4.8, and find a hyperbolic-trigonometric formula for
the solutions of a cubic equation $x^3 + px + q = 0$ with real coefficients. For which
p, q does it work? How many solutions does it provide?

4.5. Solve, using the procedure in Section 4.14, the following equations of degree 4:

(a) $x^4 + 4x + 3 = 0$;

(b) $x^4 + 2x^2 + 4x + 2 = 0$;

(c) $x^4 + 480x + 1924 = 0$.

Remarks. It is easy to solve the equation (a) by the usual high school method
of guessing, plugging and dividing; it is given here because it also provides a good
illustration of our method. However, it is permissible to apply the high school
method described above to the auxiliary cubic equations arising from equations (b)
and (c). In the latter case it is not easy to guess a root; for a desperate reader who
fails to guess, we provide a clue: try -100.

4.6. Find the solutions of the equation

$$x^4 + px^2 + qx + r = 0$$

following the lines of Sections 4.12, 4.13 with

$$\begin{aligned}
y_1 &= x_1x_2 + x_3x_4,\\
y_2 &= x_1x_3 + x_2x_4,\\
y_3 &= x_1x_4 + x_2x_3.
\end{aligned}$$

4.7. Let m, n, k be integers such that mnk is an exact square. Find an equation
$x^4 + px^2 + qx + r = 0$ with rational p, q, r for which $\sqrt{m} + \sqrt{n} + \sqrt{k}$ is a root.

Hint. Find an equation of degree 4 for which the auxiliary cubic equation is
$(x + m)(x + n)(x + k) = 0$.




LECTURE 5 

Equations of Degree Five 

5.1 Introduction. In Lecture 4 we presented "radical" formulas solving equa- 
tions of degrees 3 and 4. These formulas express the roots of polynomials of degree 
3 and 4 (plus, possibly, some extraneous roots) in terms of the coefficients of these 
polynomials. More precisely, the roots can be obtained from the coefficients by 
the operations of addition, subtraction, multiplication, and extracting roots of ar- 
bitrary positive integral degrees. Our goal in this lecture is to prove that no such 
formula can exist for polynomials of degree 5 and more. 

The first result of this kind was obtained in 1828 by Niels Henrik Abel, who 
found an individual polynomial of degree 5 with integral coefficients such that no 
root of this polynomial can be obtained from rational numbers by the operations 
listed above. A general theory explaining such phenomena was created approxi- 
mately at the same time by Evariste Galois. (Unfortunately, the work of Galois, 
who died at a very young age in 1832, became broadly known to the mathematical 
community only 50 years after his death.) The theorem which we shall prove here 
does not deal with any individual equation: it studies the dependence of the roots 
of a polynomial on the coefficients; in particular, we shall not care about the ratio- 
nality or irrationality of the coefficients. The proof will be geometrical, although it 
is based (in an implicit way) on the ideas of Galois theory. 



5.2 What is a radical formula? Let us start with a quadratic equation,

(5.1)    $x^2 + px + q = 0.$

The solutions are expressed by the formula

$$x = \frac{-p \pm \sqrt{p^2 - 4q}}{2}.$$

We can describe the procedure of finding roots avoiding the awkward symbol "$\pm$".
Instead, we write the sequence of formulas

$$x_1^2 = p^2 - 4q, \qquad x_2 = -\frac{1}{2}p + \frac{1}{2}x_1.$$

Starting with p, q, we find $x_1$, then $x_2$, and $x_2$ will be a solution. Since $x_1$ is not
unique, $x_2$ is also not unique: we find all solutions of the equation (5.1).



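Turn now to a cubic equation,

(5.2)    $x^3 + px + q = 0.$

Again, we write a chain of formulas,

$$\begin{aligned}
x_1^2 &= \frac{p^3}{27} + \frac{q^2}{4},\\
x_2^3 &= -\frac{q}{2} + x_1,\\
x_3^3 &= -\frac{q}{2} - x_1,\\
x_4 &= x_2 + x_3.
\end{aligned}$$

We have two values for $x_1$, then three values for each of $x_2, x_3$. Seemingly, we have
$2 \cdot 3 \cdot 3 = 18$ values for $x_4$, but actually only 9 of them may be different (replacing
$x_1$ by $-x_1$ merely interchanges $x_2$ and $x_3$). The solutions of
equation (5.2) are three of them (the other 6 will be the roots of the polynomials
$x^3 + \varepsilon_3 px + q$ and $x^3 + \varepsilon_3^2 px + q$, where $\varepsilon_3 = -\frac{1}{2} + \frac{\sqrt{3}}{2}i$ is "the primitive cubic root of
1").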
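The count of values can be observed numerically. The sketch below is only an illustration: it uses the hypothetical sample $p = -3$, $q = 1$, enumerates all 18 combinations of the chain, and sorts the distinct values of $x_4$ into genuine and extraneous ones. The helper names `cube_roots` and `distinct` are assumptions of this example.

```python
import cmath

def cube_roots(z):
    """The three complex cube roots of a non-zero number z."""
    radius, angle = abs(z) ** (1 / 3), cmath.phase(z)
    return [cmath.rect(radius, (angle + 2 * cmath.pi * k) / 3) for k in range(3)]

def distinct(values, tol=1e-9):
    """Drop numerically repeated values."""
    kept = []
    for v in values:
        if all(abs(v - w) > tol for w in kept):
            kept.append(v)
    return kept

p, q = -3.0, 1.0                                  # hypothetical sample equation
root = cmath.sqrt(p ** 3 / 27 + q ** 2 / 4)       # one value of x1; the other is -root
x4_values = [x2 + x3
             for x1 in (root, -root)
             for x2 in cube_roots(-q / 2 + x1)
             for x3 in cube_roots(-q / 2 - x1)]

eps3 = cmath.exp(2j * cmath.pi / 3)               # primitive cubic root of 1
candidates = distinct(x4_values)
genuine = [x for x in candidates if abs(x ** 3 + p * x + q) < 1e-9]
extraneous = [x for x in candidates
              if min(abs(x ** 3 + eps3 * p * x + q),
                     abs(x ** 3 + eps3 ** 2 * p * x + q)) < 1e-9]
```

The reason behind the split is that $(x_2x_3)^3 = -p^3/27$, so $x_2x_3$ equals $-p/3$ times a cubic root of 1, and $x_4^3 = -q + 3x_2x_3x_4$.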

In a similar way, we can present the solutions of equations of degree 4 (see
Exercise 5.1). Now, we can give a precise definition of a "radical formula". We say
that the equation

(5.3)    $x^n + a_1x^{n-1} + \dots + a_{n-1}x + a_n = 0$

(with variable complex coefficients $a_1, \dots, a_n$) is solvable in radicals if there exist
polynomials $p_1, \dots, p_N$ (in $n, n + 1, \dots, n + N - 1$ variables) and positive integers
$k_1, \dots, k_N$ such that for any (complex) root $x = x_N$ of the polynomial (5.3) with
given $a_1, \dots, a_n$ there are (complex) numbers $x_1, \dots, x_{N-1}$ satisfying the system

$$\begin{aligned}
x_1^{k_1} &= p_1(a_1, \dots, a_n),\\
x_2^{k_2} &= p_2(a_1, \dots, a_n, x_1),\\
&\;\;\vdots\\
x_N^{k_N} &= p_N(a_1, \dots, a_n, x_1, \dots, x_{N-1}).
\end{aligned}$$

We shall apply this definition also in the case when equation (5.3) contains (like
equation (5.2)) fewer than n variable coefficients.

5.3 Main result.

Theorem 5.1. The equation

(5.4)    $x^5 - x + a = 0$

is not solvable in radicals.

The goal of this lecture is to prove this theorem.

This theorem will imply that the general equation (5.3) with $n \ge 5$ is also
not solvable in radicals. (Indeed, if equation (5.3) with some $n \ge 5$ is solvable in
radicals, then the same is true for the equation $x^n - x^{n-4} + ax^{n-5} = x^{n-5}(x^5 - x + a) = 0$;
then the equation $x^5 - x + a = 0$ is also solvable in radicals.)

Before we start proving Theorem 5.1, we shall make a general remark. The
proof may seem unusual to many people. Instead of directly dealing with radical
formulas, we shall analyze in some detail things apparently unrelated to our goal. And
when some readers may start feeling irritated with this abundant "preparatory
work", we shall find out that the proof is over.

5.4 Number of roots.

Proposition 5.1. If equation (5.4) has multiple roots then $a^4 = \frac{4^4}{5^5}$, in other
words,

$$a = \pm\frac{4}{5\sqrt[4]{5}} \quad\text{or}\quad \pm\frac{4i}{5\sqrt[4]{5}}.$$

Lemma 5.2. If b is a multiple root of the equation (5.4) then $5b^4 = 1$.

Proof of Lemma. If b is a multiple root of the polynomial $x^5 - x + a$, then
$x^5 - x + a = (x - b)^2p(x)$ where p is a polynomial of degree 3. Take $x = b + \varepsilon$ where
$\varepsilon$ is a very small number. Then

$$(b + \varepsilon)^5 - (b + \varepsilon) + a = \varepsilon^2p(b + \varepsilon),$$

$$b^5 + 5\varepsilon b^4 + \varepsilon^2(10b^3 + 10b^2\varepsilon + 5b\varepsilon^2 + \varepsilon^3) - b - \varepsilon + a = \varepsilon^2p(b + \varepsilon).$$

Delete $b^5 - b + a = 0$ and divide by $\varepsilon$:

$$5b^4 - 1 = \varepsilon\,(p(b + \varepsilon) - 10b^3 - 10b^2\varepsilon - 5b\varepsilon^2 - \varepsilon^3).$$

This is true for any $\varepsilon$, but the right hand side will be arbitrarily small (in absolute
value) if $\varepsilon$ is small. Hence, $5b^4 - 1$ is "arbitrarily small," that is, $5b^4 - 1 = 0$. □

Proof of Proposition. If $5b^4 = 1$, then

$$a^4 = (b - b^5)^4 = b^4(1 - b^4)^4 = \frac{1}{5} \cdot \frac{4^4}{5^4} = \frac{4^4}{5^5}. \qquad\Box$$
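A quick floating-point check of Lemma 5.2 and Proposition 5.1, offered as a sketch: it takes the positive real solution of $5b^4 = 1$, computes the corresponding a, and confirms that the polynomial and its derivative both vanish at b. The variable names are illustrative.

```python
b = 5 ** -0.25                 # positive real solution of 5 b^4 = 1
a = b - b ** 5                 # value of a for which b is a root of x^5 - x + a

def f(x):
    return x ** 5 - x + a

def df(x):
    return 5 * x ** 4 - 1      # derivative of x^5 - x + a

# The four dangerous values of a differ by powers of i and share the same fourth power.
dangerous = [a * 1j ** k for k in range(4)]
```

Here `f(b)` and `df(b)` are both zero up to rounding, which is the double-root condition, and `a` agrees with $4/(5\sqrt[4]{5})$.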



5.5 Variation of a. If $a = 0$ then the equation is $x^5 - x = 0$, and the solutions
are $0, \pm 1, \pm i$. If we vary a then the 5 roots will also vary, but they will not collide
if a avoids the dangerous values $\pm\frac{4}{5\sqrt[4]{5}}, \pm\frac{4i}{5\sqrt[4]{5}}$ (Figure 5.1).

What happens if a traverses some closed path ("loop") starting and ending
at 0 (and avoiding the dangerous values)? The five roots $0, \pm 1, \pm i$ of the equation
$x^5 - x = 0$ will come back to $0, \pm 1, \pm i$; but will each individual root return to its
initial seat? No! The roots, in general, will interchange their positions (Figure 5.2);
moreover, they can do it in an arbitrary way. We will prove it below, but before
we even make a rigorous statement we have to talk a little about permutations.

Figure 5.1. A variation of a yields a variation of roots

Figure 5.2. A loop variation of a yields a permutation of roots

5.6 Permutations. We shall be interested only in permutations of the set of 5
elements which we will denote by 1, 2, 3, 4, 5 (so the word "permutation" will always
refer to a permutation of this set). A notation for a permutation is $(i_1i_2i_3i_4i_5)$ where
$i_1, i_2, i_3, i_4, i_5$ are different integers between 1 and 5. The notation above means that
the permutation acts as

$$1 \mapsto i_1,\quad 2 \mapsto i_2,\quad 3 \mapsto i_3,\quad 4 \mapsto i_4,\quad 5 \mapsto i_5.$$

We shall usually present the permutation by diagrams like the one in Figure 5.3
where the arrows indicate the images of 1, 2, 3, 4, 5; for example, the permutation in
Figure 5.3 is (41352). (The arrows in a figure like Figure 5.3 are usually assumed
straight, but we may want to deform them a little to avoid triple intersections; for
this reason, the arrow $3 \to 3$ in Figure 5.3 is not genuinely straight.)
The total number of permutations is 120.

Figure 5.3. A permutation

If we successively perform two permutations, first $\alpha$ and then $\beta$, we get a new
permutation, which is called the product of permutations $\alpha$ and $\beta$ and is denoted
by $\alpha\beta$. For example,

$$(21435)(13254) = (31524),$$
$$(13254)(21435) = (24153)$$

(which shows that the product may depend on the order of the factors). For every
permutation $\alpha$, there is the inverse permutation, $\alpha^{-1}$, which is obtained from $\alpha$
by reversing the arrows. The products $\alpha\alpha^{-1}$ and $\alpha^{-1}\alpha$ are both equal to the
identity permutation $e = (12345)$. Note that both products and inversions can be
visualized by means of diagrams like Figure 5.3: to find a product, we need to draw
the second factor under the first one; to find an inverse permutation, we need to
reflect the diagram of the permutation in a horizontal line. This is illustrated by
Figure 5.4 where the equalities $(41352)(21354) = (52341)$ and $(41532)^{-1} = (25413)$
are demonstrated.




FIGURE 5.4. Operations on permutations 
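The product and inverse conventions can be pinned down in a few lines of Python. This is a minimal sketch: permutations are stored as tuples of images, and the function names `parse`, `product` and `inverse` are illustrative. The only thing taken from the text is the convention that in $\alpha\beta$ the permutation $\alpha$ acts first.

```python
def parse(word):
    """Turn the text's notation, e.g. "41352", into a tuple of images."""
    return tuple(int(ch) for ch in word)

def product(alpha, beta):
    """Product alpha*beta in the text's convention: first alpha, then beta."""
    return tuple(beta[image - 1] for image in alpha)

def inverse(alpha):
    """Reverse the arrows: if alpha sends s to t, the inverse sends t to s."""
    result = [0] * len(alpha)
    for s, image in enumerate(alpha, start=1):
        result[image - 1] = s
    return tuple(result)

identity = parse("12345")
example_1 = product(parse("21435"), parse("13254"))
example_2 = product(parse("13254"), parse("21435"))
figure_product = product(parse("41352"), parse("21354"))
figure_inverse = inverse(parse("41532"))
```

With this convention the four computed values reproduce the equalities stated in the text: (31524), (24153), (52341) and (25413).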



For a permutation $(i_1i_2i_3i_4i_5)$ one can count "the number of disorders," that is,
the number of pairs s, t with $1 \le s < t \le 5$, $i_s > i_t$. Thus the number of disorders
varies from 0 (the identity permutation (12345) has 0 disorders) to 10 (the reversion
permutation (54321) has 10 disorders). The permutation (41352) has 5 disorders
(4 > 1, 4 > 3, 4 > 2, 3 > 2, 5 > 2). The best way to count disorders is to use a
diagram like Figure 5.3: disorders correspond to crossings on this diagram (this is
why we wanted to avoid triple crossings).

It is not the number of disorders, but rather its parity, that has a major signifi- 
cance in the theory of permutations. A permutation is called even, if the number of 
disorders is even, and is called odd, if the number of disorders is odd. For example, 
(12345) and (54321) are even permutations, while the permutation (41352) is odd. 

Proposition 5.3. The product of two permutations of the same parity is even.
The product of two permutations of opposite parities is odd.

Proof. We use the description of the product of two permutations as presented
in Figure 5.4. For every s = 1, 2, 3, 4, 5, there is a two-edge polygonal path starting
at the point s of the upper row and going downward (see Figure 5.5). For two such
paths, starting at s and t, there are 3 possibilities: (1) they do not cross each
other either in the upper half of the diagram, or in the lower half; (2) they cross
each other either in the upper half or in the lower half, but not in both; (3) they
cross each other both in the upper half and in the lower half of the diagram. The
total number of crossings of the two paths in the two halves of the diagram is,
respectively, 0, 1, and 2. The number of crossings of the same paths in the product
diagram (where the paths are straightened) is, respectively, 0, 1, and 0. We see
that the parities before and after the paths are straightened are the same. □



Figure 5.5. Proof of Proposition 5.3



Corollary 5.2. For every permutation $\alpha$, the permutations $\alpha$ and $\alpha^{-1}$ have
the same parity.

Proof. This fact (obvious directly) follows from the equality $\alpha\alpha^{-1} = e$. □

Corollary 5.3. There are precisely 60 even permutations and 60 odd permutations.

Proof. Let $\alpha_1, \dots, \alpha_N$ be all even permutations, and let $\gamma$ be some odd
permutation (say, (12354)). Then the permutations $\beta_1 = \gamma\alpha_1, \dots, \beta_N = \gamma\alpha_N$ are all
odd and are all different: if $\gamma\alpha = \gamma\alpha'$, then $\gamma^{-1}\gamma\alpha = \gamma^{-1}\gamma\alpha'$, that is, $\alpha = \alpha'$.
Moreover, every odd permutation is among the $\beta_i$'s: if $\beta$ is odd, then $\gamma^{-1}\beta$ is even
and $\beta = \gamma\gamma^{-1}\beta$. Thus the number of even permutations is equal to the number
of odd permutations, and since every permutation is either even or odd, and the
total number of permutations is 120, the number of even permutations, as well as
the number of odd permutations, equals 60. □
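Since there are only 120 permutations, Proposition 5.3 and Corollary 5.3 can also be confirmed by brute force. The sketch below is a check, not a replacement for the proofs; the function names are illustrative, and `product` follows the text's convention (first factor acts first).

```python
from itertools import permutations

def disorders(perm):
    """Number of pairs s < t with perm[s] > perm[t]."""
    n = len(perm)
    return sum(1 for s in range(n) for t in range(s + 1, n) if perm[s] > perm[t])

def is_even(perm):
    return disorders(perm) % 2 == 0

def product(alpha, beta):
    """First alpha, then beta."""
    return tuple(beta[image - 1] for image in alpha)

all_perms = list(permutations(range(1, 6)))
even = [p for p in all_perms if is_even(p)]
odd = [p for p in all_perms if not is_even(p)]

# Proposition 5.3: the product is even exactly when the factors have the same parity.
parity_is_multiplicative = all(
    is_even(product(a, b)) == (is_even(a) == is_even(b))
    for a in all_perms for b in all_perms
)
```

The check runs over all 14,400 ordered pairs of permutations.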

Now we shall prove two theorems about permutations which will be used in 
subsequent sections of this lecture. First, we shall prove that every permutation 
can be presented as a product involving only a few very special permutations. The 
special permutations are 

$$\alpha_1 = (52341),\quad \alpha_2 = (15342),\quad \alpha_3 = (12543),\quad \alpha_4 = (12354)$$

(briefly, $\alpha_i$ swaps i with 5 and fixes the rest of the numbers).

Theorem 5.4. Any permutation can be presented as a product of $\alpha_i$'s.

Proof. Consider a diagram (like Figure 5.3) of the given permutation. Assume
that no two crossings belong to the same horizontal level, and cut the diagram by
horizontal lines into pieces such that there is precisely one crossing within each piece
(Figure 5.6). Then the permutation falls into a product of "elementary transpositions"

$$\beta_1 = (21345),\quad \beta_2 = (13245),\quad \beta_3 = (12435),\quad \beta_4 = (12354)$$

($\beta_i$ swaps i with i + 1 and fixes the remaining numbers). It remains to notice that

$$\beta_1 = \alpha_2\alpha_1\alpha_2,\quad \beta_2 = \alpha_3\alpha_2\alpha_3,\quad \beta_3 = \alpha_4\alpha_3\alpha_4,\quad \beta_4 = \alpha_4$$

(check this!). □

Figure 5.6. A decomposition of a permutation into a product of $\beta$'s
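The "check this!" step and the statement of Theorem 5.4 lend themselves to a direct computation. The following is a small sketch, with the helper `word` and the closure loop being illustrative constructions: it verifies the four identities for the $\beta_i$ and then collects every permutation reachable as a product of $\alpha_i$'s.

```python
def product(alpha, beta):
    """First alpha, then beta."""
    return tuple(beta[image - 1] for image in alpha)

identity = (1, 2, 3, 4, 5)
alphas = [(5, 2, 3, 4, 1), (1, 5, 3, 4, 2), (1, 2, 5, 4, 3), (1, 2, 3, 5, 4)]
betas = [(2, 1, 3, 4, 5), (1, 3, 2, 4, 5), (1, 2, 4, 3, 5), (1, 2, 3, 5, 4)]

def word(*indices):
    """The product alpha_{i1} alpha_{i2} ... for 1-based indices."""
    result = identity
    for i in indices:
        result = product(result, alphas[i - 1])
    return result

# Everything reachable from the identity by multiplying with alpha's.
reachable = {identity}
frontier = [identity]
while frontier:
    current = frontier.pop()
    for alpha in alphas:
        nxt = product(current, alpha)
        if nxt not in reachable:
            reachable.add(nxt)
            frontier.append(nxt)
```

The set `reachable` ends up with all 120 permutations, in agreement with the theorem.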

For two permutations, $\alpha$ and $\beta$, their commutator $[\alpha, \beta]$ is defined as $\alpha\beta\alpha^{-1}\beta^{-1}$.
Clearly, the commutator of any two permutations is an even permutation.

Theorem 5.5. Any even permutation is a product of commutators of even
permutations.




Figure 5.7. Proof of Theorem 5.5 



Proof. Cut the diagram of the given even permutation by horizontal lines into
pieces containing two crossings each (Figure 5.7). Then our permutation will become
the product of two-crossing permutations. Obviously, there are precisely 9
such permutations (see Figure 5.8), $\gamma_1, \dots, \gamma_9$.

Each of these permutations is a commutator of two even permutations:

$$\begin{aligned}
\gamma_1 &= [(42135), \gamma_5], & \gamma_4 &= [\gamma_3, \gamma_2], & \gamma_7 &= [\gamma_2, \gamma_3],\\
\gamma_2 &= [(42351), (14352)], & \gamma_5 &= [(52431), (53241)], & \gamma_8 &= [(53241), (52431)],\\
\gamma_3 &= [\gamma_9, (14352)], & \gamma_6 &= [\gamma_2, \gamma_1], & \gamma_9 &= [\gamma_1, \gamma_2]
\end{aligned}$$

(check this!). □

Figure 5.8. Permutations $\gamma$
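The three entries of the table that involve only explicitly written permutations can be evaluated directly. The sketch below uses the text's conventions (in a product the first factor acts first, and $[\alpha, \beta] = \alpha\beta\alpha^{-1}\beta^{-1}$); which diagram of Figure 5.8 each result corresponds to is not checked here, only that each result has exactly two crossings. The function names are illustrative.

```python
def product(alpha, beta):
    """First alpha, then beta."""
    return tuple(beta[image - 1] for image in alpha)

def inverse(alpha):
    result = [0] * len(alpha)
    for s, image in enumerate(alpha, start=1):
        result[image - 1] = s
    return tuple(result)

def commutator(alpha, beta):
    """[alpha, beta] = alpha beta alpha^{-1} beta^{-1}."""
    return product(product(product(alpha, beta), inverse(alpha)), inverse(beta))

def disorders(perm):
    n = len(perm)
    return sum(1 for s in range(n) for t in range(s + 1, n) if perm[s] > perm[t])

c2 = commutator((4, 2, 3, 5, 1), (1, 4, 3, 5, 2))   # the entry given for gamma_2
c5 = commutator((5, 2, 4, 3, 1), (5, 3, 2, 4, 1))   # the entry given for gamma_5
c8 = commutator((5, 3, 2, 4, 1), (5, 2, 4, 3, 1))   # the entry given for gamma_8
```

The results are (21354), (14235) and (13425); the last two are inverse to each other, as they must be since $[\beta, \alpha] = [\alpha, \beta]^{-1}$.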



Remark 5.4. Theorem 5.4 and its proof are valid for permutations of the
set of n elements for any $n \ge 2$ (certainly, in this case we shall have to consider
n - 1 permutations $\alpha_i$). However, the statement of Theorem 5.5 is not true for
permutations of the set of n elements if n < 5. Actually, this is the reason why
equations of degree less than 5 can be solved in radicals.
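The contrast between n = 5 and n < 5 can be seen by brute force. The sketch below, with illustrative function names, collects all commutators of even permutations of {1, ..., n} and then all products of such commutators; it is an experiment supporting the remark, not a proof.

```python
from itertools import permutations

def product(alpha, beta):
    """First alpha, then beta."""
    return tuple(beta[image - 1] for image in alpha)

def inverse(alpha):
    result = [0] * len(alpha)
    for s, image in enumerate(alpha, start=1):
        result[image - 1] = s
    return tuple(result)

def is_even(perm):
    n = len(perm)
    return sum(perm[s] > perm[t] for s in range(n) for t in range(s + 1, n)) % 2 == 0

def commutator(alpha, beta):
    return product(product(product(alpha, beta), inverse(alpha)), inverse(beta))

def products_of_commutators_of_even(n):
    """Permutations of {1..n} that are products of commutators of even permutations."""
    even = [p for p in permutations(range(1, n + 1)) if is_even(p)]
    generators = {commutator(a, b) for a in even for b in even}
    identity = tuple(range(1, n + 1))
    reachable, frontier = {identity}, [identity]
    while frontier:
        current = frontier.pop()
        for g in generators:
            nxt = product(current, g)
            if nxt not in reachable:
                reachable.add(nxt)
                frontier.append(nxt)
    return reachable

counts = {n: len(products_of_commutators_of_even(n)) for n in (3, 4, 5)}
```

For n = 5 all 60 even permutations are obtained, while for n = 4 only 4 of the 12 even permutations appear and for n = 3 only the identity does.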



Figure 5.9. A special variation of a



5.7 Variation of a and permutations of roots. Consider a closed path
(a "loop") in the plane of a that starts at 0, goes right along the real axis to
a close proximity of the "dangerous point" $a_0 = \frac{4}{5\sqrt[4]{5}}$, then encircles this point
counterclockwise along a circle of a very small radius, and then returns to 0 along
the real axis.

It turns out that the roots of our polynomial $x^5 - x + a$ react to this variation
of a in the following way (see Figure 5.9). The roots $-1$ and $\pm i$ trace relatively
small loops and return to their initial positions. On the contrary, the roots 0 and
1 approach the point $b_0 = \frac{1}{\sqrt[4]{5}}$ (and, hence, each other), then make counterclockwise
half-rotations around $b_0$, and then move along the real axis, respectively, to 1 and 0. In
particular, they swap their positions: 0 goes to 1 and 1 goes to 0.




Figure 5.10. Two graphs



Let us explain why. The graph of the function $y = x^5 - x$ is shown in Figure
5.10, left (it is easy to draw it as the difference of the two well known graphs $y = x^5$
and $y = x$). If we add $a > 0$ to the function, then the graph goes up, the root $-1$
moves slightly to the left, the roots 0 and 1 move towards each other and almost
collide (at the point $b_0$) when a approaches $a_0$. The roots $\pm i$ remain complex
conjugate; they never cross the real axis (because if they do, they reach the real
axis simultaneously and become a double root there).

When a encircles $a_0$, the three roots originating from $-1$ and $\pm i$ stay almost
unchanged; but what happens to the two other roots? Let us take a small (in
absolute value) complex number $\varepsilon$ and look for which a the polynomial $x^5 - x + a$
has the root $b_0 + \varepsilon$. This is a matter of an easy computation:

$$\begin{aligned}
a &= (b_0 + \varepsilon) - (b_0 + \varepsilon)^5\\
&= b_0 - b_0^5 + \varepsilon(1 - 5b_0^4) - \varepsilon^2(10b_0^3 + 10\varepsilon b_0^2 + 5\varepsilon^2b_0 + \varepsilon^3)\\
&= a_0 - \varepsilon^2(10b_0^3 + 10\varepsilon b_0^2 + 5\varepsilon^2b_0 + \varepsilon^3) \approx a_0 - 10b_0^3\varepsilon^2
\end{aligned}$$

(we used the fact that $b_0 - b_0^5 = a_0$ and $1 - 5b_0^4 = 0$; the symbol $\approx$ means an
approximation with an error much less, in absolute value, than $|\varepsilon|^2$). The argument
of $\varepsilon^2$ changes twice as fast as the argument of $\varepsilon$, and in the same direction. So, when x
makes a counterclockwise half-rotation around $b_0$, a makes a full counterclockwise rotation
around $a_0$. And vice versa: when a makes a full rotation around $a_0$, x makes a
half-rotation around $b_0$; as a result, the two roots x close to $b_0$ swap their positions.
This justifies Figure 5.9.

Now, let us make one more simple observation. The roots of the polynomial
$x^5 - x + ia$ are obtained from the roots of the polynomial $x^5 - x + a$ by multiplication
by i. This means that if we turn the left part of Figure 5.9 counterclockwise by 90°, then
the right part of Figure 5.9 will also turn counterclockwise by 90°. We see that the loops,
similar to the loop in the left part of Figure 5.9, but encircling $ia_0$, $-a_0$, and $-ia_0$ (instead
of $a_0$), will swap the root 0, respectively, with $i$, $-1$, and $-i$ and keep the remaining
three roots in their positions (see Figure 5.11).

Figure 5.11. Swapping the roots

And one more remark. A composition of two loops (that is, a loop which first 
goes along the first loop and then along the second loop) leads to a permutation of 
roots which is the product of permutations corresponding to the two loops. 
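The swap of the roots 0 and 1 can be watched numerically by following the roots while a moves along the loop around $a_0$. The sketch below is an experiment under stated assumptions, not part of the argument: the circle radius 0.1, the step counts, and the use of Newton's method started from the previous position of each root are all choices of this example, and `track_roots` is a hypothetical helper name.

```python
import cmath

def track_roots(path, roots, newton_steps=30):
    """Follow each root of x^5 - x + a = 0 while a runs through the points of path."""
    roots = list(roots)
    for a in path:
        for j, x in enumerate(roots):
            for _ in range(newton_steps):
                x -= (x ** 5 - x + a) / (5 * x ** 4 - 1)
            roots[j] = x
    return roots

a0 = 4 / (5 * 5 ** 0.25)          # the dangerous point on the positive real axis
rho, n, m = 0.1, 500, 1000        # circle radius and step counts (assumptions)

# From 0 along the real axis to a0 - rho.
outward = [(a0 - rho) * k / n for k in range(1, n + 1)]
# One counterclockwise turn around a0, starting and ending at a0 - rho.
circle = [a0 - rho * cmath.exp(2j * cmath.pi * k / m) for k in range(1, m + 1)]
# Back to 0 along the real axis.
back = outward[::-1][1:] + [0.0]

start = [1, 1j, -1, -1j, 0]       # the roots in the order 1, i, -1, -i, 0
final_roots = track_roots(outward + circle + back, [complex(z) for z in start])
```

At the end the root that started at 1 sits at 0 and the root that started at 0 sits at 1, while the other three are back where they began; in the numbering 1, i, -1, -i, 0 this is the transposition (52341).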

From all this, we can deduce the main result of this section (promised in Section
5.5).

Theorem 5.6. For any permutation of the 5 roots $0, \pm 1, \pm i$, there exists a loop
starting and ending at 0 and avoiding the points $\pm\frac{4}{5\sqrt[4]{5}}, \pm\frac{4i}{5\sqrt[4]{5}}$, which leads to this
permutation.

Proof. Enumerate the roots in the order $1, i, -1, -i, 0$. The construction above
gives loops which induce the permutations $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ (in the notation of Section
5.6). Hence, we can find a loop that induces any product of these permutations,
that is, according to Theorem 5.4, an arbitrary permutation. □

5.8 Variation of a and permutations of intermediate radicals. Suppose
that equation (5.4) can be solved in radicals:

$$\begin{aligned}
x_1^{k_1} &= p_1(a),\\
x_2^{k_2} &= p_2(a, x_1),\\
&\;\;\vdots\\
x_N^{k_N} &= p_N(a, x_1, \dots, x_{N-1}),
\end{aligned}$$

and all the solutions of equation (5.4) are contained among the possible values of
$x_N$. Theoretically, we can have as many as $k_1k_2 \cdots k_N$ values of $x_N$, but some of
them may coincide for all values of a (we observed this phenomenon in the case of a
cubic equation). Thus we have a "tower" of values of $a, x_1, \dots, x_N$, see Figure 5.12.
(Figure 5.12 presents a schematic picture, which cannot occur in reality: if two
values of $x_2$ coincide as shown in this picture then there will be at least two other
pairs of merging values.) If we vary a then the whole tower begins varying. What
is important, the values of $x_N$, which are solutions of the equation (5.4), remain
solutions of this equation.

Figure 5.12. The tower of values of $x_i$

One more remark. As we observed earlier, some values of $x_M$ (for any M) may
coincide for all values of a. But there might be also accidental coincidences which
occur for isolated specific values of a. For example, the equation

$$x_1^{k_1} = p_1(a)$$

has $k_1$ solutions (for $x_1$) if $p_1(a) \ne 0$; if $p_1(a) = 0$, there is only one solution
($x_1 = 0$). So, the roots of the polynomial $p_1$ (but there are only finitely many
of them) are sites of "accidental coincidences." We have to declare these roots
"dangerous" (in addition to the 4 dangerous points of Section 5.5). A coincidence in
the second row happens if

$$p_2(a, x_1) = p_2(a, x_1')$$

for two different solutions $x_1, x_1'$ of the equation $x_1^{k_1} = p_1(a)$. The system

$$\begin{aligned}
x_1^{k_1} &= p_1(a),\\
(x_1')^{k_1} &= p_1(a),\\
p_2(a, x_1) &= p_2(a, x_1')
\end{aligned}$$

either has finitely many solutions $(a, x_1, x_1')$ (in which case we declare the corresponding
values of a dangerous) or has solutions for all a and also some isolated
solutions $(a, x_1, x_1')$ (and we declare dangerous the values of a corresponding to
these solutions). Proceeding in this way, we declare dangerous a finite collection
of values of a and in the future consider only loops which avoid all dangerous values,
old and new. (By the way, it may happen that 0 becomes a dangerous value.
Then we should consider loops which start and end not at 0, but at some nearby
non-dangerous point.)

5.9 Commutators of loops. Let $\ell_1, \ell_2$ be two loops in the plane of a. Consider
the loop $[\ell_1, \ell_2] = \ell_1\ell_2\ell_1^{-1}\ell_2^{-1}$ which traverses the loop $\ell_1$, then $\ell_2$, then $\ell_1$ in
the reverse direction, and then $\ell_2$ in the reverse direction (Figure 5.13). This loop
is called the commutator of the loops $\ell_1$ and $\ell_2$.

Figure 5.13. The commutator of loops.



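Lemma 5.5. If a loop $\ell$ in the plane of the variable a is a product of commutators
of loops (avoiding the dangerous points), then the variation of a along $\ell$ returns
each value of $x_1$ to its initial position.

Proof. For any (non-dangerous) value of a, the $k_1$ values of $x_1$ can be obtained
from one value as

$$x_1,\ x_1\varepsilon_{k_1},\ x_1\varepsilon_{k_1}^2,\ \dots,\ x_1\varepsilon_{k_1}^{k_1-1},$$

where $\varepsilon_k = \cos\frac{2\pi}{k} + i\sin\frac{2\pi}{k}$ is the "primitive k-th root of 1". The ratios of these
values of $x_1$ remain constant in the process of the variation of a.

Let $\ell = \ell_1\ell_2\ell_1^{-1}\ell_2^{-1}$. Let the variation of a along $\ell_1$ take $x_1$ to $x_1\varepsilon_{k_1}^{m_1}$ (and,
hence, take $x_1\varepsilon_{k_1}^{r}$ to $x_1\varepsilon_{k_1}^{r+m_1}$) and the variation of a along $\ell_2$ take $x_1$ to $x_1\varepsilon_{k_1}^{m_2}$
(and, hence, take $x_1\varepsilon_{k_1}^{r}$ to $x_1\varepsilon_{k_1}^{r+m_2}$). Then the successive variations of a along
$\ell_1, \ell_2, \ell_1^{-1}$, and $\ell_2^{-1}$ transform $x_1$ according to the rule

$$x_1 \mapsto x_1\varepsilon_{k_1}^{m_1} \mapsto x_1\varepsilon_{k_1}^{m_1+m_2} \mapsto x_1\varepsilon_{k_1}^{m_1+m_2-m_1} \mapsto x_1\varepsilon_{k_1}^{m_1+m_2-m_1-m_2} = x_1.$$

Thus, the variation of a along a commutator of loops takes every value of $x_1$ to
itself, and the same is true for products of commutators. □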



Lemma 5.6. If a loop $\ell$ in the plane of a is

    a product of commutators of
    products of commutators of

loops (avoiding the dangerous points), then the variation of a along $\ell$ returns each
value of $x_1$ and $x_2$ to its initial position.

Proof. Let $\ell$ be a commutator of loops $\ell_1, \ell_2$ which are, in turn, products of
commutators. Then, according to Lemma 5.5, the variation of a along each of the
loops $\ell_1, \ell_2$ takes every value of $x_1$ to itself. Since

$$x_2^{k_2} = p_2(a, x_1),$$

the variation along $\ell_1$ must take $x_2$ to $x_2\varepsilon_{k_2}^{m_1}$ for some $m_1$, and the variation along
$\ell_2$ must take $x_2$ to $x_2\varepsilon_{k_2}^{m_2}$ for some $m_2$. In this case, $x_2\varepsilon_{k_2}^{r}$ will be taken to
$x_2\varepsilon_{k_2}^{r+m_1}$ and $x_2\varepsilon_{k_2}^{r+m_2}$, respectively
(the ratios between the $x_2$'s corresponding to the same, varying, value of $x_1$ remain
constant during the variation). Thus, the successive variations along the loops
$\ell_1, \ell_2, \ell_1^{-1}, \ell_2^{-1}$ transform $x_2$ in the following way:

$$x_2 \mapsto x_2\varepsilon_{k_2}^{m_1} \mapsto x_2\varepsilon_{k_2}^{m_1+m_2} \mapsto x_2\varepsilon_{k_2}^{m_1+m_2-m_1} \mapsto x_2\varepsilon_{k_2}^{m_1+m_2-m_1-m_2} = x_2.$$

Thus, the variation of a along the commutator of products of commutators of
loops avoiding the dangerous points takes any value of $x_2$ (as well as any value of
$x_1$) to itself, and the same is true for any product of commutators of products of
commutators. □

Proceeding in the same way, we prove a chain of lemmas ending with 

Lemma 5.7. If a loop $\ell$ is

    a product of commutators of
    products of commutators of
    . . . (N times) . . .
    products of commutators of

loops (avoiding the dangerous points), then the variation of a along $\ell$ returns any
value of $x_1, \dots, x_N$ to its initial position.

We are fully prepared now for the 

5.10 Proof of main theorem. Suppose that the equation $x^5 - x + a = 0$ is
solvable in radicals. Fix some non-identical even permutation $\alpha_0$ of roots. Present
$\alpha_0$ as a product of commutators of even permutations,

$$\alpha_0 = [\alpha_1, \alpha_2][\alpha_3, \alpha_4] \dots [\alpha_{2s-1}, \alpha_{2s}].$$

Then present each $\alpha_i$ as a product of commutators of even permutations,

$$\alpha_i = [\alpha_{i1}, \alpha_{i2}] \dots [\alpha_{i,2t-1}, \alpha_{i,2t}].$$

And so on, N times. For each permutation $\alpha_{i_1 \dots i_N}$ occurring in the last, N-th,
step, find a loop $\ell_{i_1 \dots i_N}$ avoiding all the dangerous values of a which induces this
permutation. In the expression of $\alpha_0$ via the $\alpha_{i_1 \dots i_N}$'s replace each permutation $\alpha_{i_1 \dots i_N}$
by the corresponding loop $\ell_{i_1 \dots i_N}$. We shall obtain a loop which is an N-fold product
of commutators of loops (as in Lemma 5.7).

On the one hand, by Lemma 5.7, this loop returns every value of $x_N$ to its initial
position, and hence returns each root of the equation (5.4) to its initial position.
On the other hand, this loop induces the (non-identical) permutation $\alpha_0$ of the
roots.

This contradiction proves the theorem. □






5.11 Exercises. 

5.1. Write a chain of formulas, as in the end of Section 5.2, solving the equation

$x^4 + qx + r = 0.$

How many solutions does it have? The extraneous solutions are the roots of some
other equations. Which ones?

Remark. We suggest considering the particular equation of degree 4 above, since
for the general equation $x^4 + px^2 + qx + r = 0$ the solution is basically the same,
but the formulas are much longer.

5.2. Prove that there are precisely 4 permutations of the set
{1, 2, 3, 4} which can be presented as products of commutators of even permu-
tations (actually, they are just commutators of even permutations), and
find these 4 permutations.

5.3. Consider the cubic equation

$x^3 + ax - 1 = 0.$

For $a = 0$, it has 3 roots: $1$, $\varepsilon_3 = \frac{-1 + i\sqrt{3}}{2}$, and $\overline{\varepsilon}_3$.

(a) For which values of $a$ does our equation have double roots (that is, what
are the "dangerous values" of $a$)?

(b) One of these dangerous values of $a$ is real (and negative). If $a$ makes a loop
around this dangerous value starting from 0 (as shown in the third diagram from
the left in the upper row in Figure 5.11), then what is the resulting permutation of
the roots?

(c) Show that any permutation of the roots can be obtained from a loop starting
from $a = 0$, avoiding dangerous values of $a$ and returning to 0.




LECTURE 6 

How Many Roots Does a Polynomial Have? 

Roots of polynomials are discussed more than once on these pages. The reaction
of an average student of mathematics (or that of a practicing mathematician) to 
the question in the title of this lecture is that the number of roots of a polynomial 
of degree n does not exceed n. Some would add that if one counts complex roots, 
and counts them with multiplicities, then this number is exactly n (we shall prove 
this Fundamental Theorem of Algebra in Section 6.4). 

The content of this lecture is different: we shall discuss two rather surprising 
facts. The first is that the number of real roots of a polynomial with real coefficients 
depends not on its degree but rather on the number of its non-zero coefficients. The 
second is that, although there are no explicit formulas for roots (see Lecture 5), one 
can determine exactly how many roots any polynomial has on a given segment. 

6.1 Fewnomials. A fewnomial is not a mathematical term.¹ That is the name
we give to a polynomial² of a large degree with only a few non-zero coefficients. A
typical fewnomial is $x^{100} - 1$ or $ax^n + bx^m$. The main property of fewnomials is
that they have only a few roots.

Theorem 6.1. A polynomial with $k$ non-zero coefficients has no more than
$2k - 1$ real roots.

Proof. Induction on $k$. For $k = 1$, the result is obvious: $ax^n = 0$ has only one
root, $x = 0$.

Let $f(x)$ be a polynomial with $k + 1$ non-zero coefficients. Then, for some $r \ge 0$,
$f(x) = x^r g(x)$ where $g(x)$ still has $k + 1$ non-zero coefficients, and one of them is
the constant term. Differentiation kills this constant term, and it follows that $g'(x)$
has $k$ non-zero coefficients. By the induction assumption, $g'(x)$ has at most $2k - 1$
roots.



1 The term was coined by A. Khovanskii. 

2 The polynomials in this lecture, except for the last section, are with real coefficients, and 
we are concerned only with their real roots. 
Rolle's theorem implies that, between two consecutive roots of a polynomial
(actually, of any differentiable function, see Figure 6.1), there exists a root of its
derivative. It follows that the number of roots of $g(x)$ does not exceed $2k$. As to
$f(x)$, its roots are those of $g(x)$ and, possibly, $x = 0$. Therefore, $f(x)$ has at most
$2k + 1$ roots, as claimed. □



Figure 6.1. Rolle's theorem

The estimate of Theorem 6.1 is sharp: $x(x^2 - 1)(x^2 - 4) \cdots (x^2 - k^2)$ has $2k + 1$
roots $0, \pm 1, \pm 2, \dots, \pm k$ and $k + 1$ non-zero coefficients.
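The sharpness example is easy to check by machine. The following Python sketch is only an illustration and not part of the argument; the helper names `poly_mul`, `poly_eval` and `sharp_example` are our own. It builds the coefficients of $x(x^2 - 1)\cdots(x^2 - k^2)$, counts the non-zero ones, and confirms that every integer from $-k$ to $k$ is a root.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_eval(p, x):
    """Evaluate a polynomial by Horner's rule."""
    acc = 0
    for c in reversed(p):
        acc = acc * x + c
    return acc


def sharp_example(k):
    """Coefficients of x (x^2 - 1)(x^2 - 4) ... (x^2 - k^2), lowest degree first."""
    p = [0, 1]  # the factor x
    for j in range(1, k + 1):
        p = poly_mul(p, [-j * j, 0, 1])  # the factor x^2 - j^2
    return p


k = 4
p = sharp_example(k)
nonzero = sum(1 for c in p if c != 0)
integer_roots = [r for r in range(-k, k + 1) if poly_eval(p, r) == 0]
print(nonzero, len(integer_roots))
```

With $k + 1$ non-zero coefficients, Theorem 6.1 allows at most $2(k + 1) - 1 = 2k + 1$ real roots, and the example attains this bound.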

6.2 Descartes' rule. Theorem 6.1 becomes too weak if one is interested only 
in positive roots. For example, a polynomial with non-negative coefficients does not 
have positive roots at all! The next, more refined, theorem is called the Descartes 
rule. 

Theorem 6.2. The number of positive roots of a polynomial does not exceed 
the number of sign changes in the sequence of its non-zero coefficients. 

In particular, Descartes' rule implies Theorem 6.1: one applies Descartes' rule 
twice, to the positive and the negative half-lines. 

Proof. Let us examine what happens to the coefficients of a polynomial when
a new positive root appears. Let $f(x) = (x - b)g(x)$ where $b > 0$ and
$g(x) = a_0x^n + a_1x^{n-1} + \dots + a_{n-1}x + a_n$. The coefficients of $f(x)$ are:

(6.1) $a_0,\ a_1 - ba_0,\ a_2 - ba_1,\ \dots,\ a_n - ba_{n-1},\ -ba_n.$

The coefficients

(6.2) $a_0, a_1, \dots, a_n$

of the polynomial $g(x)$ are grouped into consecutive blocks of numbers having the
same sign (we do not care about zeros); see Figure 6.2 where these blocks are
represented by ovals. We see that in each place where the sequence $a_i$ changes
sign, the corresponding term of (6.1) has the same sign as the right oval. Namely,
if $a_k < 0$, $a_{k+1} > 0$ then $a_{k+1} - ba_k > 0$, and if $a_k > 0$, $a_{k+1} < 0$ then
$a_{k+1} - ba_k < 0$. Moreover, at
the beginning of the sequence (6.1), we have the same sign as at the beginning of
sequence (6.2), and at the end of (6.1), one has the sign opposite to that at the end
of (6.2).

It follows that the number of sign changes in the sequence (6.1) is at least
one greater than that in the sequence (6.2). That is, each new positive root of a
polynomial increases the number of sign changes in the sequence of its coefficients
by at least one.

Figure 6.2. How a new root affects the coefficients of a polynomial: if $a_i < 0 < a_{i+1}$
then $a_{i+1} - ba_i > 0$, and if $a_i > 0 > a_{i+1}$ then $a_{i+1} - ba_i < 0$

To finish the proof, write $f(x)$ as $(x - b_1) \cdots (x - b_k)g(x)$ where $g(x)$ does not
have positive roots. Then the number of sign changes in the sequence of coefficients
of $f(x)$ is at least that of $g(x)$ plus $k$, hence not less than $k$. □
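The key step of the proof, passing from the sequence (6.2) to the sequence (6.1), can be tried out on a concrete polynomial. The Python sketch below is ours and purely illustrative (the names `sign_changes` and `times_x_minus_b` are not from the text). Starting from $g(x) = x^2 + 1$, which has no positive roots, it adjoins the roots 1, 2, 3 one at a time and checks that each new root adds at least one sign change.

```python
def sign_changes(coeffs):
    """Number of sign changes in a sequence, ignoring zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def times_x_minus_b(coeffs, b):
    """Coefficients of (x - b) g(x), highest degree first, as in sequence (6.1)."""
    out = [coeffs[0]]
    for prev, cur in zip(coeffs, coeffs[1:]):
        out.append(cur - b * prev)
    out.append(-b * coeffs[-1])
    return out


f = [1, 0, 1]  # g(x) = x^2 + 1: no positive roots, no sign changes
for b in (1, 2, 3):
    f_new = times_x_minus_b(f, b)
    # each new positive root adds at least one sign change
    assert sign_changes(f_new) >= sign_changes(f) + 1
    f = f_new

print(f, sign_changes(f))
```

In this example the final polynomial has three positive roots, while its coefficients change sign more often than that, so the inequality of Theorem 6.2 may be strict.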

6.3 Sturm's method. In this section we shall explain how to determine
the number of roots of a polynomial on a given segment. Let $f(x)$ be a poly-
nomial without multiple roots. We shall construct a sequence of polynomials
$p_0(x), p_1(x), p_2(x), \dots, p_n(x)$ of decreasing degrees, called a Sturm sequence, that
enjoys the properties:

(1) $p_0(x) = f(x)$, $p_1(x) = f'(x)$;

(2) if $p_k(t) = 0$ then the numbers $p_{k-1}(t)$ and $p_{k+1}(t)$ are non-zero and have
opposite signs;

(3) the last polynomial $p_n(x)$ does not have any roots at all.³

For a given $x$, let $S(x)$ be the number of sign changes in the sequence
$p_0(x), \dots, p_n(x)$ (again, ignoring zeros). For example, the sequence 2, 0, 1 has no
sign changes whereas 2, 0, -1 has one.

To determine the number of roots of $f(x)$ on an interval $(a, b)$ (we do not exclude
the case when either $a$, or $b$, or both are infinite) one computes $S(a) - S(b)$: this is
the number of roots of $f(x)$ on the interval $(a, b)$.

Let us prove that this is indeed the case. As $x$ moves from $a$ to $b$, the number
$S(x)$ may change only when $x$ is a root of one of the polynomials $p_i$. If $x$ is a root
of $p_0 = f$ then either $f$ changes sign from $-$ to $+$, and then $f'(x) > 0$, or from $+$ to
$-$, and then $f'(x) < 0$, see Figure 6.3. The first two terms in the Sturm sequence
change as follows:

$(-\ +\ \cdots) \mapsto (+\ +\ \cdots)$ or $(+\ -\ \cdots) \mapsto (-\ -\ \cdots),$

and in both cases, the number of sign changes decreases by 1.

Figure 6.3. The number of sign changes decreases by 1 (the two cases $f'(x) > 0$ and $f'(x) < 0$)

If $x$ is a root of a polynomial $p_k$ with $0 < k < n$ then, according to the second
property of Sturm sequences, the signs of $p_{k-1}(x)$ and $p_{k+1}(x)$ are opposite. This
implies that, no matter how the sign of $p_k$ changes, the number of sign changes in
the Sturm sequence remains the same:

$(\cdots\ -\ +\ +\ \cdots) \mapsto (\cdots\ -\ -\ +\ \cdots)$ or $(\cdots\ +\ +\ -\ \cdots) \mapsto (\cdots\ +\ -\ -\ \cdots).$

It remains to construct a Sturm sequence. This is done by long division of
polynomials. We already know the terms $p_0$ and $p_1$. To construct $p_{k+1}$ from $p_k$
and $p_{k-1}$, divide the latter by the former and take the remainder with the opposite
sign:

(6.3) $p_{k-1}(x) = q(x)p_k(x) - p_{k+1}(x).$

Note that the degree of the next term, $p_{k+1}$, is smaller than that of $p_k$; thus, the
division will terminate after finitely many steps.

This process is a version of the Euclidean algorithm for finding the greatest
common divisor of two numbers, discussed in Section 1.9. And indeed, the last
polynomial, $p_n(x)$, is the greatest common divisor of $f(x)$ and $f'(x)$. This implies
that $p_n(x)$ has no roots: if such a root existed it would be also a common root
of $f(x)$ and $f'(x)$, that is, $f(x)$ would have a multiple root,⁴ the case that was
excluded from the very beginning.

It remains to verify property (2) of Sturm sequences. Assume that $p_k(x) = 0$.
We see from (6.3) that $p_{k+1}(x)$ and $p_{k-1}(x)$ have opposite signs, provided they are
both non-zero. If $p_{k+1}(x) = 0$ then (6.3) implies that $p_{k-1}(x) = 0$ as well, and
iterating the argument, eventually that $p_1(x) = p_0(x) = 0$. But this again means
that $f$ has a multiple root at $x$, which is impossible.

To summarize, (6.3) is an algorithm for constructing a Sturm sequence. Let us
work out an example.

Let $f(x) = x^5 - x + a$, the main character of Lecture 5. The Sturm sequence
consists of four terms:

$x^5 - x + a, \quad x^4 - \frac{1}{5}, \quad x - \frac{5a}{4}, \quad \frac{4^4}{5^5} - a^4$

(we scaled some terms by positive factors to keep the leading coefficient equal to
1). One immediately sees that the values of $a$ for which $a^4 = 4^4/5^5$ are "dangerous",
hardly a new fact for those familiar with Lecture 5.

To fix ideas, assume that $a^4 < 4^4/5^5$. Let us determine the total number of
real roots of $f(x)$. The signs of the Sturm sequence at $-\infty$ and $+\infty$ are

$(-,+,-,+)$ and $(+,+,+,+),$

hence there are 3 roots (if $a^4 > 4^4/5^5$ this number is equal to 1).

³ Recall that we consider only real roots.

⁴ This fact is proved in Section 8.2.
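The construction (6.3) is entirely mechanical, so it can be sketched in a few lines of Python. The code below is our own illustration under stated assumptions, not the authors' implementation: polynomials are lists of exact fractions with the highest degree first, the function names are hypothetical, and `count_roots` assumes that the endpoints are not roots of $f$. Unlike the worked example, the terms are not rescaled by positive factors, which does not affect the signs.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zeros; coefficients are listed from the highest degree down."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]


def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [Fraction(0)]


def remainder(a, b):
    """Remainder of the long division of a by b."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a = a[1:]  # the leading coefficient is now zero
    return trim(a) if a else [Fraction(0)]


def sturm_sequence(f):
    """p_0 = f, p_1 = f', and p_{k+1} = -(remainder of p_{k-1} by p_k), as in (6.3)."""
    f = trim([Fraction(c) for c in f])
    seq = [f, derivative(f)]
    while len(seq[-1]) > 1:
        r = remainder(seq[-2], seq[-1])
        if not any(r):
            raise ValueError("f has a multiple root")
        seq.append([-c for c in r])
    return seq


def evaluate(p, x):
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc


def sign_changes(values):
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_roots(f, a, b):
    """S(a) - S(b) for finite endpoints a < b that are not roots of f."""
    seq = sturm_sequence(f)

    def S(x):
        return sign_changes([evaluate(p, x) for p in seq])

    return S(Fraction(a)) - S(Fraction(b))


small = [1, 0, 0, 0, -1, Fraction(1, 10)]  # x^5 - x + 1/10, with a^4 < 4^4/5^5
large = [1, 0, 0, 0, -1, 1]                # x^5 - x + 1, with a^4 > 4^4/5^5
print(count_roots(small, -10, 10), count_roots(large, -10, 10))
```

The interval $(-10, 10)$ stands in for the whole line here, since all real roots of these two polynomials lie well inside it.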
What about the number of positive roots? To answer this question, we need
to evaluate the polynomials of the Sturm sequence at $x = 0$. We get:

$a, \quad -\frac{1}{5}, \quad -\frac{5a}{4}, \quad \frac{4^4}{5^5} - a^4.$

Still assuming that the last number is positive, there are two cases: $a > 0$ and
$a < 0$. In the former, we have the signs $(+,-,-,+)$, and in the latter $(-,-,+,+)$.

Comparing this to the signs at $+\infty$ (all pluses), we conclude that if $a > 0$ there
are 2 positive roots, and if $a < 0$ there is a single one (this also holds for $a = 0$).

6.4 Fundamental Theorem of Algebra. We feel that this lecture would
not be complete without a discussion of the Fundamental Theorem of Algebra.⁵
The reader probably knows what it states: every complex polynomial of degree $n$
has exactly $n$ complex roots (counted with multiplicities). In fact, it suffices to
show that there exists one: if $b$ is a root of $f(x)$ then $x - b$ divides $f(x)$. Then the
quotient also has a root, etc., until all $n$ roots are found.

It has been noted that nearly every branch of mathematics tests its techniques
and demonstrates its maturity by providing a proof of the Fundamental Theorem of
Algebra. We shall give a proof that makes use of the notion of the rotation number
of a closed curve about a point.⁶ Given a polynomial $f(x) = x^n + a_1x^{n-1} + \dots + a_n$
with complex coefficients, assume that 0 is not its value for any complex $x$. We
view $f$ as a continuous map from the complex plane to itself.

Consider the circle of radius $t$ about the origin and let $\gamma_t$ be the image of this
circle under the map $f$. Then $\gamma_t$ is a closed curve that does not pass through the
origin. Let $r(t)$ be the rotation number (total number of turns) of this curve about
the origin. As $t$ varies from very small to very big values, the number $r(t)$ does not
change: indeed, this is an integer that depends continuously on $t$, and is therefore
constant.

Let us compute $r(t)$ for $t$ very small. The constant term $a_n$ of $f(x)$ is non-
zero, otherwise $f(0) = 0$. If $t$ is small enough then the curve $\gamma_t$ lies in a small
neighborhood of the point $a_n$ and does not go around the origin at all, see Figure
6.4. Thus $r(t) = 0$.

Figure 6.4. For $t$ small, the rotation number is zero



5 Its first proof was published by d'Alembert in 1746. 

6 See Lecture 12 for a detailed discussion. 
What about $r(t)$ for $t$ very large? Let us write

$f(x) = x^n\left(1 + \frac{a_1}{x} + \frac{a_2}{x^2} + \dots + \frac{a_n}{x^n}\right).$

Now we deform this formula:

(6.4) $f_s(x) = x^n\left(1 + s\left(\frac{a_1}{x} + \frac{a_2}{x^2} + \dots + \frac{a_n}{x^n}\right)\right),$

where $s$ varies from 1 to 0. Let $\gamma_{t,s}$ be the image of the circle of radius $t$ under the
map $f_s$.

If $|x|$ is sufficiently large then the complex number inside the inner parentheses
in (6.4) is small, in particular, its absolute value is less than 1. Indeed, let
$|x|^i > n|a_i|$ for all $i = 1, \dots, n$. Then

$\left|\frac{a_1}{x} + \frac{a_2}{x^2} + \dots + \frac{a_n}{x^n}\right| \le \frac{|a_1|}{|x|} + \dots + \frac{|a_n|}{|x|^n} < n \cdot \frac{1}{n} = 1.$

It follows that the curves $\gamma_{t,s}$ do not pass through the origin for all $s$. Hence they
all have the same rotation numbers about the origin. This rotation number is
especially easy to find for $s = 0$: since $f_0(x) = x^n$, the curve $\gamma_{t,0}$ is a circle of radius
$t^n$, making $n$ turns about the origin and hence having the rotation number $n$.

It follows that $r(t) = n$ for $t$ large enough, while $r(t) = 0$ for small values of $t$.
This is a contradiction, which proves that $f(x)$ has a root.
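The two computations of $r(t)$ can be observed numerically. The Python sketch below is an illustration of ours, not part of the proof; the function name `rotation_number` and the sampling parameter `samples` are assumptions. It approximates the rotation number by adding up small changes of the argument of $f$ along the circle of radius $t$. For a polynomial that does have roots, such as $x^3 + x + 1$, the value jumps from 0 to $n$ as $t$ grows, which is exactly what the argument above rules out for a polynomial without roots.

```python
import cmath
import math


def rotation_number(coeffs, t, samples=20000):
    """Approximate rotation number about 0 of the image of the circle |x| = t.

    coeffs lists the coefficients from the highest degree down; the circle is
    assumed not to pass through a root of the polynomial.
    """
    def f(x):
        acc = 0j
        for c in coeffs:
            acc = acc * x + c
        return acc

    total = 0.0
    prev = f(complex(t, 0))
    for k in range(1, samples + 1):
        cur = f(t * cmath.exp(2j * math.pi * k / samples))
        total += cmath.phase(cur / prev)  # small change of the argument
        prev = cur
    return round(total / (2 * math.pi))


cubic = [1, 0, 1, 1]  # x^3 + x + 1
print(rotation_number(cubic, 0.1), rotation_number(cubic, 10))
```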

In conclusion, let us outline a different argument, very much in the spirit of
Lectures 8 and 5.

We considered in these lectures the space of polynomials of a certain type (such
as $x^3 + px + q$ or $x^5 - x + a$) and saw that the set of polynomials with multiple
roots separated the whole space into pieces, corresponding to the number of roots
of a polynomial. The set of polynomials with multiple roots is a (very singular)
hypersurface obtained by equating the discriminant of a polynomial to zero.

Unlike the real case, the set of zeros of a complex equation does not separate
complex space. This is particularly obvious in dimension one: a finite set of points
partitions the real line into a number of segments and two rays, but does not
separate the complex plane, so that every two points can be connected by a path
avoiding this set.

One starts with a polynomial $f_0(x)$ of degree $n$ that manifestly has $n$ roots,
say, $(x - 1)(x - 2) \cdots (x - n)$. Any other polynomial, $f(x)$, without multiple roots
can be connected with $f_0(x)$ by a path in the space of polynomials without multiple
roots. When one "moves" $f_0$ to $f$, the roots also move, but never collide, until they
become the roots of $f(x)$ (this process is described in detail in Lecture 5).
6.5 Exercises.

6.1. A polynomial of degree $n$ is called hyperbolic if it has $n$ distinct real roots.
Prove that if $f(x)$ is a hyperbolic polynomial then so is its derivative $f'(x)$.

6.2. Prove that between two roots of a polynomial $f(x)$ there is a root of the
polynomial $f'(x) + af(x)$ where $a$ is an arbitrary real number.

6.3. Let $f(x)$ be a polynomial without multiple roots. Consider the curve in
the plane given by the equations $x = f(t)$, $y = f'(t)$.

(a) This curve does not pass through the origin.

(b) The curve intersects the positive $y$-semiaxis from left to right, and the
negative one from right to left.

(c) Conclude that between two roots of $f$ there lies a root of $f'$ (Rolle's theo-
rem).

6.4. (a) If a graph $y = f(x)$ intersects a line in three distinct points then there
is an inflection point of the graph between the two outermost intersections.

(b) If a function coincides at $n + 1$ points with a polynomial of degree $n - 1$
then its $n$th derivative has a root.

6.5. The polynomial

$1 + \frac{x}{1!} + \frac{x^2}{2!} + \dots + \frac{x^n}{n!}$

has either no roots or one root according as $n$ is even or odd.

Remark. In contrast, as $n \to \infty$, the complex roots of this polynomial have an
interesting distribution. More precisely, in the limit $n \to \infty$, the complex roots of
the polynomial

$1 + \frac{nx}{1!} + \frac{(nx)^2}{2!} + \dots + \frac{(nx)^n}{n!}$

tend to the curve $|ze^{1-z}| = 1$, a theorem by G. Szegő [77].

6.6. Prove that the number of positive roots of a polynomial has the same
parity as the number of sign changes in the sequence of its coefficients.

6.7. Prove the Descartes rule for the function

$f(x) = a_1e^{\lambda_1x} + a_2e^{\lambda_2x} + \dots + a_ne^{\lambda_nx}$:

if $\lambda_1 < \lambda_2 < \dots < \lambda_n$ then the number of roots of the equation $f(x) = 0$ does not
exceed the number of sign changes in the sequence $a_1, \dots, a_n$.

6.8. Compute the Sturm sequence and determine the number of roots of the
polynomial $x^3 - 3x + 1$ on the segments $[-3, 0]$ and $[0, 3]$.

6.9. * The following result is known as the Fourier-Budan theorem.

Let $f(x)$ be a polynomial of degree $n$. Let $S(x)$ be the number of sign changes in
the sequence $f(x), f'(x), f''(x), \dots, f^{(n)}(x)$. Then the number of roots of $f$ between
$a$ and $b$, where $f(a) \ne 0$, $f(b) \ne 0$ and $a < b$, is not greater than and has the same
parity as $S(a) - S(b)$.

6.10. The following approach to the Fundamental Theorem of Algebra is due
to Gauss.

Let $f(x)$ be a generic complex polynomial of degree $n$. Consider the two curves,
$\gamma_1$ and $\gamma_2$, given by the conditions that the real and the imaginary parts of $f(x)$
equal zero. One wants to prove that $\gamma_1$ and $\gamma_2$ intersect.

Prove that each curve $\gamma_1$ and $\gamma_2$ intersects the boundary $C$ of a sufficiently
large disc $D$ at exactly $2n$ points, and the points $\gamma_1 \cap C$ alternate with the points
$\gamma_2 \cap C$. Conclude that $\gamma_1 \cap D$ and $\gamma_2 \cap D$ consist of $n$ components each, and that
every component of $\gamma_1 \cap D$ crosses some component of $\gamma_2 \cap D$.




LECTURE 7 
Chebyshev Polynomials 

7.1 The problem. The topic of this lecture is a very elegant problem on 
polynomials that goes back to P. Chebyshev, an outstanding Russian 19-th century 
mathematician (cf. Lecture 18). 

Fix a segment of the real axis, say, $[-2, 2]$ (the formulas for this segment are the
simplest possible; see Exercise 7.2 for the general case). Given a monic polynomial
of degree $n$

(7.1) $P_n(x) = x^n + a_1x^{n-1} + \dots + a_n,$

let $M$ and $m$ be its maximum and minimum on the segment $[-2, 2]$. The deviation
of $P_n(x)$ from zero is the greatest of the numbers $|M|$ and $|m|$. If the deviation
from zero is $c > 0$ then the graph of the polynomial is contained in the strip $|y| \le c$
and is not contained in any narrower strip symmetric with respect to the $x$-axis.

The problem is to find the monic polynomial of degree $n$ whose deviation from
zero is as small as possible, and to find the value of this smallest deviation.

7.2 Small degrees. Let us experiment with polynomials of small degrees. 

Example 7.1. If $P_1(x) = x + a$ then $M = a + 2$, $m = a - 2$. If $c$ is the deviation
from zero then $|a + 2| \le c$, $|a - 2| \le c$. By the triangle inequality

$2c \ge |a + 2| + |a - 2| \ge |(a + 2) - (a - 2)| = 4,$

and hence $c \ge 2$. This deviation is attained by the polynomial $P_1(x) = x$.

Example 7.2. Let us consider polynomials of degree two, $P_2(x) = x^2 + px + q$.
A little reflection suggests that the optimal position of the graph of a quadratic
polynomial is the most symmetric one, as in Figure 7.1. This figure depicts the
graph of the polynomial $x^2 - 2$ whose deviation from zero is 2.

Figure 7.1. The "best" quadratic parabola

Let us prove that this is indeed the answer for polynomials of degree 2. If $c$ is
the deviation from zero of $P_2(x)$ then the moduli of its values at the end points $\pm 2$
and at the point 0 do not exceed $c$:

$c \ge |4 - 2p + q|,$
$c \ge |4 + 2p + q|,$
$c \ge |q|.$

Using the triangle inequality again yields:

$4c \ge |4 - 2p + q| + |4 + 2p + q| + 2|q| \ge |(4 - 2p + q) + (4 + 2p + q) - 2q| = 8,$

and hence $c \ge 2$.

Example 7.3. Let us try our luck once again, for polynomials of degree three,
$P_3(x) = x^3 + px^2 + qx + r$. If $c$ is the deviation from zero of $P_3(x)$ then the moduli
of its values at the end points $\pm 2$ and at the points $\pm 1$ do not exceed $c$:

$c \ge |-8 + 4p - 2q + r|,$
$c \ge |8 + 4p + 2q + r|,$
$c \ge |-1 + p - q + r|,$
$c \ge |1 + p + q + r|.$

The triangle inequality yields:

$2c \ge |-8 + 4p - 2q + r| + |8 + 4p + 2q + r| \ge |16 + 4q|,$
$4c \ge 2|-1 + p - q + r| + 2|1 + p + q + r| \ge |4 + 4q|,$

and applying the triangle inequality again,

$6c \ge |16 + 4q| + |4 + 4q| \ge |(16 + 4q) - (4 + 4q)| = 12,$

which implies that $c \ge 2$. An example of a cubic polynomial with deviation from
zero 2 is $x^3 - 3x$.

An adventurous reader may try to consider polynomials of degree 4 but this 
is not a very inviting task. One may conjecture that the least deviation from zero 
will be always 2, but to prove this, one needs more than just "brute force". 
7.3 Solution. Suppose that, for some $c > 0$, we found a polynomial $P_n(x)$
of degree $n$ such that its graph (over the segment $[-2, 2]$) lies in the strip $|y| \le c$
and contains $n + 1$ points of its horizontal boundaries: the right-most on the upper
boundary $y = c$, the next left on the lower boundary $y = -c$, the next left on $y = c$,
etc. Thus the graph snakes between the lines $y = \pm c$, touching them alternately
$n + 1$ times.




Figure 7.2. The graph of an optimal polynomial 



Theorem 7.1. The deviation from zero of any monic polynomial of degree $n$
is not less than $c$, and $P_n(x)$ is the unique monic polynomial of degree $n$ with the
deviation from zero equal to $c$.

Proof. Let $Q_n(x)$ be another monic polynomial of degree $n$ whose deviation
from zero is less than or equal to $c$. Then its graph also lies in the strip $|y| \le c$.

Let us partition this strip into $n$ rectangles by the vertical lines through
the maxima and minima of $P_n(x)$, see Figure 7.2. The graph of $P_n(x)$ connects the
diagonally opposite vertices of each rectangle, therefore the graph of $Q_n(x)$ intersects
that of $P_n(x)$ in each rectangle (see Figure 7.3). There are $n$ such rectangles, and
therefore the equation $P_n(x) - Q_n(x) = 0$ has at least $n$ roots. But $P_n(x) - Q_n(x)$
is a polynomial of degree $n - 1$, which has at most $n - 1$ roots, unless it is identically
zero. Conclusion: $P_n(x) = Q_n(x)$, and we are done. □

We cannot claim "mission accomplished" yet because we still do not have the
polynomials $P_n(x)$ satisfying the conditions of Theorem 7.1.

Figure 7.3. Proof of Theorem 7.1

Lemma 7.4. There exists a monic polynomial $P_n(x)$ of degree $n$ such that

(7.2) $2\cos n\alpha = P_n(2\cos\alpha).$

For example,

$2\cos 2\alpha = 4\cos^2\alpha - 2 = (2\cos\alpha)^2 - 2$, thus $P_2(x) = x^2 - 2$,

$2\cos 3\alpha = 8\cos^3\alpha - 6\cos\alpha = (2\cos\alpha)^3 - 3(2\cos\alpha)$, thus $P_3(x) = x^3 - 3x$.

Proof (cf. Proposition 2.4). Induction on $n$. Assume that (7.2) holds for $n - 1$
and $n$. Then

$\cos(n + 1)\alpha + \cos(n - 1)\alpha = 2\cos\alpha\cos n\alpha,$

and hence

$2\cos(n + 1)\alpha = 4\cos\alpha\cos n\alpha - 2\cos(n - 1)\alpha = (2\cos\alpha)(2\cos n\alpha) - 2\cos(n - 1)\alpha$
$= (2\cos\alpha)P_n(2\cos\alpha) - P_{n-1}(2\cos\alpha).$

Therefore

(7.3) $P_{n+1}(x) = xP_n(x) - P_{n-1}(x).$

This recurrence relation defines the desired sequence of monic polynomials $P_n$ of
degree $n$. □

The polynomials $P_n(x)$ are what we need. Indeed, let $\alpha$ vary over $[0, \pi]$. Then
$n\alpha$ varies over $[0, n\pi]$, and the functions $x = 2\cos\alpha$ and $P_n(x) = 2\cos n\alpha$ range
over the segment $[-2, 2]$. Moreover, $x$ covers this segment exactly once, while
$P_n(x)$ covers it $n$ times, assuming alternating values $\pm 2$ for $x = 2\cos(k\pi/n)$,
$k = 0, \dots, n$. This means that the graph of the polynomial $P_n(x)$ lies in the strip $|y| \le 2$
and contains $n + 1$ points of its alternating boundaries.

Let us summarize: there is a unique monic polynomial of degree $n$, given by
(7.2), whose deviation from zero on the segment $[-2, 2]$ is 2, and the deviation from
zero of any other monic polynomial of degree $n$ is greater.
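The recurrence (7.3) gives a quick way to generate these polynomials and to test the summary numerically. The Python sketch below is our own illustration (the names `chebyshev` and `evaluate`, the test angle 0.7 and the grid size are arbitrary choices). It builds $P_0, \dots, P_7$, checks the identity (7.2) at one angle, and estimates the deviation from zero on $[-2, 2]$ on a fine grid.

```python
import math


def chebyshev(n):
    """Coefficient lists (lowest degree first) of P_0, ..., P_n via (7.3)."""
    polys = [[2], [0, 1]]
    for _ in range(2, n + 1):
        prev, cur = polys[-2], polys[-1]
        nxt = [0] + cur              # x * P_k
        for i, c in enumerate(prev):
            nxt[i] -= c              # ... minus P_{k-1}
        polys.append(nxt)
    return polys[: n + 1]


def evaluate(p, x):
    """Horner's rule for a coefficient list with the lowest degree first."""
    acc = 0.0
    for c in reversed(p):
        acc = acc * x + c
    return acc


polys = chebyshev(7)

# Identity (7.2): 2 cos(n alpha) = P_n(2 cos alpha), tested at one angle.
alpha = 0.7
for n, p in enumerate(polys):
    assert abs(evaluate(p, 2 * math.cos(alpha)) - 2 * math.cos(n * alpha)) < 1e-9

# Deviation from zero on [-2, 2], estimated on a grid containing both endpoints.
grid = [-2 + 4 * i / 4000 for i in range(4001)]
deviations = [max(abs(evaluate(p, x)) for x in grid) for p in polys[1:]]
print(deviations)
```

The grid estimate is only a sanity check; the exact statement is the one proved above.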

The polynomials $P_n(x)$ are called Chebyshev polynomials; see Figure 7.4 for
the graphs of the first Chebyshev polynomials.

Figure 7.4. The graphs of the first seven Chebyshev polynomials



7.4 Formulas. The Chebyshev polynomials

$P_0(x) = 2,\ P_1(x) = x,\ P_2(x) = x^2 - 2,\ P_3(x) = x^3 - 3x,\ P_4(x) = x^4 - 4x^2 + 2,\ \dots$

can be described by a number of explicit formulas. For example, consider the
continued fraction:

$R_n(x) = x - \cfrac{1}{x - \cfrac{1}{x - \cdots - \cfrac{1}{x - \cfrac{2}{x}}}}.$

Thus

$R_1 = x, \quad R_2 = \frac{x^2 - 2}{x}, \quad R_3 = \frac{x^3 - 3x}{x^2 - 2},$

and in general,

Lemma 7.5.

$R_n(x) = \frac{P_n(x)}{P_{n-1}(x)}.$

Proof. Induction on $n$. One has:

$R_{n+1} = x - \frac{1}{R_n} = x - \frac{P_{n-1}(x)}{P_n(x)} = \frac{xP_n(x) - P_{n-1}(x)}{P_n(x)} = \frac{P_{n+1}(x)}{P_n(x)},$

the last equality due to the recurrence relation (7.3). □
Here is another formula which describes the coefficients of the Chebyshev poly-
nomials via binomial coefficients.

Theorem 7.2.

$P_n(x) = x^n - \frac{n}{n-1}\binom{n-1}{1}x^{n-2} + \frac{n}{n-2}\binom{n-2}{2}x^{n-4} + \dots + (-1)^j\frac{n}{n-j}\binom{n-j}{j}x^{n-2j} + \dots$

(with $j \le n/2$).

Proof. One can encode Chebyshev polynomials in the generating function

$\Phi(z) = 2 + xz + (x^2 - 2)z^2 + \dots + P_n(x)z^n + \dots.$

Using the recurrence relation (7.3), one can replace each $P_n(x)$, starting with $n = 2$,
by a combination of $P_{n-1}(x)$ and $P_{n-2}(x)$:

$\Phi(z) = 2 + xz + (xP_1(x) - P_0(x))z^2 + (xP_2(x) - P_1(x))z^3 + (xP_3(x) - P_2(x))z^4 + \dots$

$= 2 + xz + xz(P_1(x)z + P_2(x)z^2 + P_3(x)z^3 + \dots) - z^2(P_0(x) + P_1(x)z + P_2(x)z^2 + \dots)$

$= 2 + xz + xz(\Phi(z) - 2) - z^2\Phi(z) = 2 - xz + (xz - z^2)\Phi(z).$

Hence

$\Phi(z) = \frac{2 - xz}{1 - xz + z^2}.$

This formula contains all the information about Chebyshev polynomials; it remains
only to extract the information from there. The key is the formula for geometric
progression:

$\frac{1}{1 - q} = 1 + q + q^2 + q^3 + \dots.$

Thus

(7.4) $\Phi(z) = (2 - xz)[1 + (xz - z^2) + (xz - z^2)^2 + (xz - z^2)^3 + \dots].$

One has:

$(xz - z^2)^k = x^kz^k - kx^{k-1}z^{k+1} + \dots + (-1)^j\binom{k}{j}x^{k-j}z^{k+j} + \dots.$

Collecting terms on the right-hand side of (7.4) yields the following coefficient in
front of $z^n$:

$x^n + \dots + (-1)^j\left[2\binom{n-j}{j} - \binom{n-j-1}{j}\right]x^{n-2j} + \dots.$

It remains to simplify the expression in the brackets:

$2\binom{n-j}{j} - \binom{n-j-1}{j} = \frac{2(n-j)!}{j!(n-2j)!} - \frac{(n-j-1)!}{j!(n-2j-1)!} = n \cdot \frac{(n-j-1)!}{j!(n-2j)!} = \frac{n}{n-j}\binom{n-j}{j},$

and the result follows. □
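As a sanity check, the closed formula of Theorem 7.2 can be compared with the recurrence (7.3) for small $n$. The following Python sketch is ours and only illustrative; the function names `by_recurrence` and `by_formula` are hypothetical, and the formula is used for $n \ge 1$ only, where $n - j \ge 1$ for all $j \le n/2$.

```python
from math import comb


def by_recurrence(n):
    """Coefficients of P_n, lowest degree first, from P_{k+1} = x P_k - P_{k-1}."""
    prev, cur = [2], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + cur
        for i, c in enumerate(prev):
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur


def by_formula(n):
    """Coefficients of P_n from Theorem 7.2, for n >= 1."""
    coeffs = [0] * (n + 1)
    for j in range(n // 2 + 1):
        # n * C(n-j, j) / (n-j) = 2 C(n-j, j) - C(n-j-1, j) is an integer,
        # so the integer division below is exact.
        coeffs[n - 2 * j] = (-1) ** j * (n * comb(n - j, j) // (n - j))
    return coeffs


for n in range(1, 13):
    assert by_recurrence(n) == by_formula(n)

print(by_formula(6))
```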



7.5 Exercises. 

7.1. There is a shortcoming in the proof of Theorem 7.1: the graphs of the two 
polynomials may be tangent at the intersection point, see Figure 7.5. Adjust the 
argument to this case. 



Figure 7.5. A difficulty with tangency



7.2. Prove that the least deviation from zero of a monic polynomial of degree $n$ on a seg-
ment $[a, b]$ equals

$2\left(\frac{b - a}{4}\right)^n.$

7.3. Find the least deviation from the function $y = e^x$ of a linear function
$y = ax + b$ on the segment $[0, 1]$.

7.4. Prove yet another formula for Chebyshev polynomials:

$P_n(x) = \frac{\left(x + \sqrt{x^2 - 4}\right)^n + \left(x - \sqrt{x^2 - 4}\right)^n}{2^n}$

(we assume here that $|x| \ge 2$).






7.5. Prove that 
$P_n(x) = \det(\,\cdots\,).$



7.6. Prove that the polynomials $P_n$ are orthogonal in the following sense:

$\int_{-2}^{2} \frac{P_m(x)P_n(x)}{\sqrt{4 - x^2}}\,dx = 0$

for all $m \ne n$.

7.7. Prove that Chebyshev polynomials commute:

$P_n(P_m(x)) = P_m(P_n(x)).$

7.8. * Consider a family of pairwise commuting polynomials with positive lead-
ing coefficients and containing at least one polynomial of each positive degree. Prove
that, up to linear substitution, it is the family of Chebyshev polynomials or the
family $x^n$.

The next three exercises provide an alternative proof of Theorem 7.1.

7.9. Consider a trigonometric polynomial of degree $n$

(7.5) $f(\alpha) = a_0 + a_1\cos\alpha + a_2\cos 2\alpha + \dots + a_n\cos n\alpha.$

Prove that its "average" value

$\frac{1}{2n}\left[f(0) - f\left(\frac{\pi}{n}\right) + f\left(\frac{2\pi}{n}\right) - \dots - f\left(\frac{(2n-1)\pi}{n}\right)\right]$

is equal to $a_n$.

7.10. Prove that the deviation from zero of the trigonometric polynomial (7.5)
on the circle $[0, 2\pi]$ is not less than $|a_n|$.

Hint. The deviation from zero is not less than

$\frac{1}{2n}\left[|f(0)| + \left|f\left(\frac{\pi}{n}\right)\right| + \dots + \left|f\left(\frac{(2n-1)\pi}{n}\right)\right|\right].$

Use Exercise 7.9 and the triangle inequality $|a| + |b| \ge |a + b|$.

7.11. Deduce Theorem 7.1 from Exercises 7.9 and 7.10.

Hint. After the substitution $x = \cos\alpha$, the polynomial $P_n(x)$ becomes a
trigonometric polynomial (7.5) with the leading coefficient $1/2^{n-1}$. As $\alpha$ ranges
over the circle, $x$ traverses the segment $[-1, 1]$.




LECTURE 8 

Geometry of Equations 

8.1 The equation $x^2 + px + q = 0$. Looking at the expression in the title of
this section, we see a quadratic equation in the variable $x$ whose coefficients are the
parameters $p$ and $q$. This is a matter of one's perspective: equally well one may
view this expression as a linear equation in the variables $p$ and $q$ with coefficients
depending on the parameter $x$. A linear equation $q = -xp - x^2$ describes a non-
vertical line; thus one has a 1-parameter family of lines in the $(p, q)$-plane, one for
each $x$.




Figure 8.1. The envelope of the family of lines $q = -xp - x^2$

Let us draw a few lines from this family, see Figure 8.1. These lines are tangent
to a curve that looks like a parabola. This envelope is the locus of intersection points
of pairs of infinitesimally close lines from our family (the reader will find explicit
formulas in Section 8.3). That our envelope is indeed a parabola will become clear
shortly. Each line in Figure 8.1 corresponds to a specific value of $x$. Let us write
this value of $x$ at the tangency point of the respective line with the envelope. This
makes the envelope into a measuring tape, like the $x$-axis, but bent; see Figure
8.2.




Figure 8.2. A bent "measure tape" 



The curve in Figure 8.2 can be used to solve the equation $x^2 + px + q = 0$
graphically. Given a point in the $(p, q)$-plane, draw a tangent line from this point
to the envelope. Then the $x$-value of the tangency point is a root of the equation
$x^2 + px + q = 0$, see Figure 8.3. In particular, the number of roots is the number of
tangent lines to the envelope from a point $(p, q)$. For the points below the envelope
there are two tangent lines, and for the points above it, none.

What about the points on the envelope? For them, there is a unique tangent
line to the envelope, that is, the two roots of the quadratic equation coincide. Thus
the envelope is the locus of points $(p, q)$ for which the equation $x^2 + px + q = 0$ has
a multiple root. This happens when $p^2 = 4q$, that is, the envelope is the parabola
$q = p^2/4$.
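The "machine" can be imitated numerically. In the Python sketch below, which is ours and only illustrative, the tangency point of the line $q = -xp - x^2$ with the parabola $q = p^2/4$ is taken to be $(-2x, x^2)$; this is our own computation (differentiate the equation of the line in $x$), and the function names are hypothetical. For a point $(p, q)$ below the parabola, the code checks that the line labelled by each root $x$ touches the parabola at that point and passes through $(p, q)$.

```python
import math


def tangent_roots(p, q):
    """Labels x of the tangent lines to the parabola q = p^2/4 through (p, q)."""
    disc = p * p - 4 * q
    if disc < 0:
        return []  # (p, q) lies above the parabola: no tangent lines
    s = math.sqrt(disc)
    return sorted({(-p - s) / 2, (-p + s) / 2})


def tangency_point(x):
    """Point where the line q = -x p - x^2 touches the parabola q = p^2/4."""
    return (-2 * x, x * x)


p, q = 1.0, -6.0  # x^2 + x - 6 = (x - 2)(x + 3), a point below the parabola
for x in tangent_roots(p, q):
    p0, q0 = tangency_point(x)
    assert abs(q0 - p0 * p0 / 4) < 1e-12       # the point lies on the parabola
    assert abs((-x) - p0 / 2) < 1e-12          # slopes of line and parabola agree
    assert abs((-x * p - x * x) - q) < 1e-12   # the line passes through (p, q)

print(tangent_roots(p, q))
```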

8.2 The equation $x^3 + px + q = 0$. A simple quadratic equation probably
does not deserve this relatively complicated treatment. Let us now consider the
more interesting cubic equation $x^3 + px + q = 0$. Although this equation still can
be solved explicitly in radicals (see Lecture 4), these formulas are not so simple
and, in some situations, not very useful. Instead, let us treat this equation as a
1-parameter family of lines in the $(p, q)$-plane.

Figure 8.4 depicts several lines and features their envelope with a scale on
it. This envelope is a cusp and it strongly resembles the semicubic parabola from
Lecture 9. We shall find its equation in a minute.
Figure 8.3. A machine for solving quadratic equations




Figure 8.4. The envelope of the family of lines $x^3 + px + q = 0$
and a bent "measure tape"

As before, the curve in Figure 8.4 is a device for solving the equation
$x^3 + px + q = 0$ graphically: draw a tangent line to the envelope from a point $(p, q)$ and read
off a root as the $x$-coordinate of the tangency point. For points inside the cusp,
there are three tangent lines, and for points outside it only one, see Figure 8.5.
The curve itself is the locus of points $(p, q)$ for which the equation has a multiple
root, and the vertex of the cusp, the origin, corresponds to the equation $x^3 = 0$
whose roots are all equal.




Figure 8.5. The number of roots of a cubic equation



To find an equation of the curve in Figures 8.4 and 8.5, one needs to know
when the equation $x^3 + px + q = 0$ has a multiple root. A general criterion is as
follows.

Lemma 8.1. A polynomial $f(x)$ has a multiple root if and only if it has a
common root with its derivative $f'(x)$.

Proof. If $a$ is a multiple root of $f(x)$ then $f(x) = (x - a)^2 g(x)$ where $g(x)$ is
also a polynomial. Then $f'(x) = 2(x - a)g(x) + (x - a)^2 g'(x)$, hence $a$ is a root of
$f'(x)$.

Conversely, let $a$ be a common root of $f$ and $f'$. Then $f(x) = (x - a)g(x)$ for
some polynomial $g$, and hence $f'(x) = g(x) + (x - a)g'(x)$. Since $a$ is a root of $f'$,
it is also a root of $g$. Thus $g(x) = (x - a)h(x)$ for some polynomial $h$, and therefore
$f(x) = (x - a)^2 h(x)$. It follows that $a$ is a multiple root of the polynomial $f$. □
In our situation, $f(x) = x^3 + px + q$ and $f'(x) = 3x^2 + p$. If $a$ is a common root
of these polynomials then $p = -3a^2$ and $q = -a^3 - pa = 2a^3$. These are parametric
equations of the envelope in Figures 8.4 and 8.5. One may eliminate $a$ from these
equations, and the result is

$$4p^3 + 27q^2 = 0.$$

This is a semicubic parabola.
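For the reader who wishes to see the elimination written out, here is one way to do it, using only the parametric equations $p = -3a^2$, $q = 2a^3$: compare the cube of $p$ with the square of $q$.

```latex
\begin{align*}
p^3 &= (-3a^2)^3 = -27a^6, \\
q^2 &= (2a^3)^2 = 4a^6, \\
4p^3 + 27q^2 &= -108a^6 + 108a^6 = 0.
\end{align*}
```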

The expression $D = -(4p^3 + 27q^2)$ is the discriminant of the polynomial $x^3 +
px + q$; its sign determines the number of real roots: if $D > 0$ there are three, and if
$D < 0$ there is one real root. By the way, there is no loss of generality in considering polynomials
$x^3 + px + q$ with zero second highest coefficient: one can always eliminate this
coefficient by a substitution $x \mapsto x + c$ (this is discussed in detail in Lecture 4).
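The relation between the sign of $D$ and the number of real roots can be checked independently. The sketch below is an illustration under our own naming: it counts the distinct real roots of $x^3 + px + q$ from the values of the polynomial at its two critical points $\pm\sqrt{-p/3}$ (which exist only for $p < 0$) and compares the count with the sign of $D$. The exact comparisons with zero are reliable only for inputs where the arithmetic is exact.

```python
import math

def discriminant(p, q):
    """Discriminant D = -(4 p^3 + 27 q^2) of x^3 + p x + q."""
    return -(4 * p**3 + 27 * q**2)

def count_real_roots(p, q):
    """Number of distinct real roots of x^3 + p x + q."""
    f = lambda x: x**3 + p * x + q
    if p >= 0:
        return 1                    # f is non-decreasing: exactly one real root
    c = math.sqrt(-p / 3)           # critical points are -c (local max) and c (local min)
    f_max, f_min = f(-c), f(c)
    if f_max > 0 > f_min:
        return 3                    # the graph crosses the axis three times
    if f_max == 0 or f_min == 0:
        return 2                    # the graph touches the axis: a double root
    return 1
```

For example, the cubic with roots $1, 2, -3$ is $x^3 - 7x + 6$; here $D = 400 > 0$, in agreement with three real roots, while $x^3 + x + 1$ has $D = -31 < 0$ and a single real root.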

8.3 Equation of the envelope. An equation for the envelope of a 1-parameter
family of curves, in particular, lines, is easy to find. Let us do this in the case of
our cubic polynomial $f(x) = x^3 + px + q$.

A point of the envelope is a point of intersection of the line $x^3 + px + q = 0$
and the infinitesimally close line $(x + \varepsilon)^3 + p(x + \varepsilon) + q = 0$. The second equation
can be written as

$$(x^3 + px + q) + \varepsilon(3x^2 + p) + O(\varepsilon^2) = 0$$

where, following the common calculus notation, $O(\varepsilon^2)$ denotes the terms of order
2 and higher in $\varepsilon$. Since $\varepsilon$ is an infinitesimal, we ignore its powers starting with
$\varepsilon^2$, and the system of equations becomes $f(x) = 0$, $f(x) + \varepsilon f'(x) = 0$, which is of
course equivalent to

$$f(x) = f'(x) = 0.$$

This is a parametric equation of the envelope ($x$ being the parameter), and this
holds true for any 1-parameter family of curves in the $(p, q)$-plane given by the
equation $f(x, p, q) = 0$. In view of Lemma 8.1, we see again that the envelope
corresponds to the points $(p, q)$ for which the cubic polynomial $x^3 + px + q$ has a
multiple root.
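As a quick illustration of the same recipe, one may apply it to the quadratic family of Section 8.1, where $f(x) = x^2 + px + q$. The system $f(x) = f'(x) = 0$ is solved for $p$ and $q$ in terms of the parameter $x$, and eliminating $x$ recovers the parabola found earlier.

```latex
\begin{align*}
f'(x) = 2x + p = 0 \quad &\Longrightarrow \quad p = -2x, \\
f(x) = x^2 + px + q = 0 \quad &\Longrightarrow \quad q = -x^2 - px = x^2, \\
&\Longrightarrow \quad q = \frac{p^2}{4}.
\end{align*}
```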

8.4 Dual curves. Consider the equation

$$l + kp + q = 0. \tag{8.1}$$

We are free to consider (8.1) as a linear equation in the variables $p, q$ depending
on $k, l$ as parameters, or as a linear equation in the variables $k, l$ depending on $p, q$
as parameters. Thus every non-vertical straight line in the $(p, q)$-plane corresponds
via (8.1) to a point of the $(k, l)$-plane, and vice versa. We have two planes, and
points of one are the non-vertical lines of the other. These two planes are said to
be dual to each other.

Let us use the following convention: points are denoted by upper-case letters
and lines by lower-case ones. Given a point of one of the planes, denote the
corresponding line of the dual plane by the same lower-case letter. We shall think of
the $(k, l)$-plane as positioned on the left and the $(p, q)$-plane on the right.

The first observation is that the incidence relation is preserved by this duality:
if $A \in l$ then $L \in a$. Indeed, let $A$ and $l$ lie in the left plane, $A = (k, l)$, and
$L = (p, q)$ be the point dual to $l$. Then equation (8.1) says that $l$ passes through $A$
but, by the same token, that $a$ passes through $L$.

For example, a triangle is a figure made of three points and three lines. Duality
interchanges points and lines but preserves their incidence, and the dual figure is
again a triangle. Likewise, the figure dual to a quadrilateral with its two diagonals
consists of four lines and their six pairwise intersection points, see Figure 8.6.



Duality extends to smooth curves. Let $\gamma$ be a curve in the left plane. The
tangent lines to $\gamma$ correspond to points of the right plane (we assume that $\gamma$ has no
vertical tangents). One obtains a 1-parameter family of points of the right plane,
that is, a curve. This curve is said to be dual to $\gamma$ and we denote it by $\gamma^*$. The
following example will clarify this construction.

Example 8.2. Let $\gamma$ be a "parabola" of degree $\alpha$ given by the equation $l = k^\alpha$.
The tangent line at the point $(t, t^\alpha)$ has the equation $l - \alpha t^{\alpha - 1} k + (\alpha - 1)t^\alpha = 0$,
that is, $l + kp + q = 0$ with $p = -\alpha t^{\alpha - 1}$, $q = (\alpha - 1)t^\alpha$. This is a parametric
equation of another parabola of degree $\beta = \alpha/(\alpha - 1)$, and hence $\gamma^*$ is a parabola
of degree $\beta$. A more symmetric way to write the relation between $\alpha$ and $\beta$ is

$$\frac{1}{\alpha} + \frac{1}{\beta} = 1.$$

In particular, if $\alpha = 2$ then $\beta = 2$, and if $\alpha = 3$ then $\beta = 3/2$.
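The two special cases can be confirmed with exact arithmetic. The following sketch (function names are ours, chosen for illustration) computes the point $(p, q)$ dual to the tangent line of $l = k^\alpha$ at $k = t$, using the formulas of Example 8.2, and lets one check that the dual points satisfy $q = p^2/4$ for $\alpha = 2$ and $4p^3 + 27q^2 = 0$ for $\alpha = 3$.

```python
from fractions import Fraction

def dual_point(t, alpha):
    """Point (p, q) dual to the tangent line of l = k**alpha at k = t.

    The tangent line is l + k*p + q = 0 with
    p = -alpha * t**(alpha - 1) and q = (alpha - 1) * t**alpha.
    """
    p = -alpha * t ** (alpha - 1)
    q = (alpha - 1) * t ** alpha
    return p, q

def tangent_line_contains_tangency_point(t, alpha):
    """Check that l + k*p + q = 0 holds at (k, l) = (t, t**alpha)."""
    p, q = dual_point(t, alpha)
    return t ** alpha + t * p + q == 0

# Sample parameter values; Fractions keep every check exact.
samples = [Fraction(n, 3) for n in range(-6, 7)]
```

With integer `alpha` and `Fraction` values of `t`, all comparisons are exact, so no tolerance is needed.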

Let us discuss how duality changes the shape of a curve. If $\gamma$ has a double
tangent line then the dual curve $\gamma^*$ has a double point, see Figure 8.7.

Suppose now that $\gamma$ has an inflection point as in Figure 8.8. An inflection point
is where a curve is abnormally well approximated by a line (at a generic point, one
has first order tangency but at an inflection point the tangency is of at least second
order). This implies that the dual curve is abnormally close to a point, that is, has
a singularity, see Figure 8.8. This qualitative argument is confirmed by Example
8.2: if $\gamma$ is a cubic parabola then $\gamma^*$ is a semicubic cusp.




Figure 8.6. Dual figures 




Figure 8.7. A double tangent is dual to a double point 
Figure 8.8. An inflection is dual to a cusp

Due to the symmetry between dual planes, one would expect duality to be a
reflexive relation, that is, if $\gamma$ is dual to $\delta$ then $\delta$ is dual to $\gamma$. This is indeed the
case.

Theorem 8.1. The curve $(\gamma^*)^*$ coincides with $\gamma$.







Figure 8.9. Proof of Theorem 8.1 

Proof. Start with the curve $\gamma^*$. To construct its dual curve one needs to consider
its tangent lines. Consider instead a secant line $l$ that intersects $\gamma^*$ at two close
points $A$ and $B$. The dual picture consists of two close tangent lines $a$ and $b$ to $\gamma$
intersecting at the point $L$, dual to the line $l$; see Figure 8.9. In the limit, as the
points $A$ and $B$ tend to each other, the line $l$ becomes tangent to $\gamma^*$ and the point
$L$ "falls" on $\gamma$. Thus $(\gamma^*)^* = \gamma$. □

This duality theorem makes it possible to read Figures 8.7 and 8.8 backwards: 
if a curve has a double point then its dual has a double tangent, and if a curve has 
a cusp then its dual has an inflection. For example, the curves in Figure 8.10 are 
dual to each other. 

Let us relate this to what we did in Sections 8.1 and 8.2. For example, the
equation $x^3 + px + q = 0$ is obtained from the equation $l + kp + q = 0$ if $k = x$, $l = x^3$.
The latter equations describe a cubic parabola in the $(k, l)$-plane. The dual curve
in the $(p, q)$-plane is the envelope of the 1-parameter family of lines $x^3 + px + q = 0$,
a semicubic parabola, as we discovered in Section 8.2.








Figure 8.10. A pair of dual curves 



We can also explain why the curves in Figures 8.2 and 8.4 do not have inflections.
The 1-parameter families of lines $x^2 + px + q = 0$ and $x^3 + px + q = 0$
determine smooth curves in the $(k, l)$-plane (for example, $k = x$, $l = x^2$, in the first
case). Therefore their dual curves in Figures 8.2 and 8.4 are free from inflections.

8.5 The projective plane. The self-imposed restriction not to consider ver- 
tical lines is somewhat embarrassing. Can one extend duality between points and 
lines to all lines? The way to go is to extend the plane to the projective plane. 

The projective plane is defined as the set of lines in Euclidean three-dimensional
space passing through the origin. Since every line intersects the unit sphere at two
antipodal points, the projective plane is the result of identifying pairs of antipodal
points of the sphere. The (real) projective plane is denoted by $\mathbb{RP}^2$.

Given a line $l$ through the origin, that is, a point of the projective plane,
choose a vector $(u, v, w)$ along $l$. This vector is not unique; it is defined up to
multiplication by a non-zero number. The coordinates $(u, v, w)$, defined up to a
factor, are called homogeneous coordinates of the point $l$.

By definition, a line in the projective plane consists of the lines in space that lie
in one plane. Choose a plane $\pi$ in space not through the origin (a screen). Assign to
every line its intersection point with this screen. Of course, some lines are parallel
to the screen and we shall temporarily ignore them. This provides an identification
of a part of the projective plane with the plane $\pi$, and the lines in the projective
plane identify with the lines in $\pi$. In other words, the plane $\pi$ can be considered
as a (large) part of the projective plane, called an affine chart. Another choice of a
screen would give another affine chart. If $\pi$ is given by the equation $z = 1$ (where
$x, y, z$ are Cartesian coordinates) then one may choose homogeneous coordinates
in the form $(u, v, 1)$ and drop the last component to obtain the usual Cartesian
coordinates in the plane $\pi$.

Which part of the projective plane does not fit into an affine chart? These
are the lines in space that are parallel to $\pi$. If a line makes a small angle with $\pi$
then its intersection point is located far away, and if this angle tends to zero, the
respective point escapes to infinity. Thus the projective plane is obtained from the
usual plane $\pi$ by adding "points at infinity". These points form a line, "the line at
infinity". The homogeneous coordinates of these points are $(u, v, 0)$.
Given a line $l$ through the origin, consider the orthogonal plane $\beta$ passing
through the origin. This defines a correspondence between points and lines of the
projective plane, the projective duality. Let $(u, v, w)$ be homogeneous coordinates
of $l$, and let the plane $\beta$ be given by the linear equation $ax + by + cz = 0$. The
condition that $l$ is orthogonal to $\beta$ is

$$au + bv + cw = 0. \tag{8.2}$$

Assume that $w \neq 0$ and $a \neq 0$. Then one may rescale the homogeneous coordinates
so that $w = a = 1$, and equation (8.2) can be written as $u + bv + c = 0$. This differs
from equation (8.1) only by the names of the variables. The condition $w \neq 0$ means
that we are restricted to an affine chart and the condition $a \neq 0$ that we consider
non-vertical lines.

8.6 The equation $x^4 + px^2 + qx + r = 0$. This equation defines a 1-parameter
family of planes in Euclidean 3-space with coordinates $p, q, r$. The envelope of this
family is a surface depicted in Figure 8.11.

Consider two planes from our family corresponding to very close values of the
parameter, $x$ and $x + \varepsilon$. These two planes intersect along a line, and this line has
a limiting position as $\varepsilon \to 0$. One has a 1-parameter family of lines, $l(x)$, and they
all lie on the surface. Thus this surface is ruled.

Now take three planes from our family, corresponding to the parameter values
$x - \varepsilon$, $x$ and $x + \varepsilon$. These three planes intersect at a point, and again this point
has a limiting position as $\varepsilon \to 0$. This point, $P(x)$, is also the intersection point
of infinitesimally close lines $l(x)$ and $l(x + \varepsilon)$. Thus the lines $l(x)$ are tangent to
the space curve $P(x)$ which they envelop. Note that, in general, a 1-parameter
family of lines in space does not envelop a curve: two infinitesimally close lines will
be skew; our situation is quite special! To summarize: the surface in Figure 8.11
consists of lines, tangent to a space curve.

In fact, it is easy to write down equations for this curve. Let $f(x) = x^4 + px^2 +
qx + r$. Arguing as in Section 8.3, a parametric equation of the curve $P(x)$ is given
by the system of equations $f(x) = f'(x) = f''(x) = 0$, that is,

$$x^4 + px^2 + qx + r = 0, \quad 4x^3 + 2px + q = 0, \quad 12x^2 + 2p = 0.$$

Hence

$$p = -6x^2, \quad q = 8x^3, \quad r = -3x^4,$$

a parametric equation of a spatial curve. This curve itself has a cusp at the origin.
In Figure 8.11, this curve consists of two smooth segments, $BA$ and $AC$; $A$ represents
the origin. Besides the curvilinear triangle $BAC$, the surface has two "wings,"
$ABGF$ and $ACHE$, attached to the segments $BA$ and $AC$ and crossing each other
along the curve $AD$. Figure 8.11 shows the surface as it was presented in the
algebraic works of the 19th century. In geometric works of the second half of
the 20th century, this surface reappeared under the name of swallow tail. We
shall consider this surface in the context of Paper Sheet Geometry, and a (more
comprehensible) picture of this surface will be presented in Lecture 13 (see Figure
13.17).

Figure 8.11. The envelope of the family of planes $x^4 + px^2 + qx + r = 0$

Figure 8.12 depicts the curve

$$p = -6x^2, \quad q = 8x^3, \quad r = -3x^4$$

separately from the surface. The side diagrams of Figure 8.12 show the projections
of this curve onto the $pq$-, $pr$-, and $qr$-planes. Notice that in the projection onto
the $pr$-plane, the curve folds into a half of a parabola.
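The parametrization of the curve $P(x)$ follows from the three equations $f = f' = f'' = 0$ by back-substitution, starting from the last one. The short derivation below spells this out, and also shows why the projection onto the $pr$-plane is half of a parabola: $r$ is a function of $p$ alone, and $p$ takes only non-positive values.

```latex
\begin{align*}
12x^2 + 2p = 0 \quad &\Longrightarrow \quad p = -6x^2, \\
4x^3 + 2px + q = 0 \quad &\Longrightarrow \quad q = -4x^3 - 2px = -4x^3 + 12x^3 = 8x^3, \\
x^4 + px^2 + qx + r = 0 \quad &\Longrightarrow \quad r = -x^4 - px^2 - qx = -x^4 + 6x^4 - 8x^4 = -3x^4, \\
r = -3x^4 = -\frac{(6x^2)^2}{12} \quad &\Longrightarrow \quad r = -\frac{p^2}{12}, \qquad p \le 0.
\end{align*}
```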




Figure 8.12. The curve itself has a cusp 
The surface in Figure 8.11 is given by the system of equations $f(x) = f'(x) = 0$,
that is,

$$x^4 + px^2 + qx + r = 0, \quad 4x^3 + 2px + q = 0.$$

If one eliminates $x$ from these equations, one obtains one relation on $p, q, r$, and
this equation will be our goal in the next section.

Similarly to Sections 8.1 and 8.2, the number of roots of the equation $x^4 +
px^2 + qx + r = 0$ equals the number of tangent planes to our surface from the
point $(p, q, r)$. This number changes by 2 when the point crosses the surface. The
set complement of the surface consists of three pieces, corresponding to 4, 2 and
0 roots.

8.7 A formula for the discriminant. The discriminant of a polynomial
$f(x)$ of degree $n$ is a polynomial $D$ in the coefficients of $f(x)$ such that $D = 0$ if
and only if $f(x)$ has a multiple root. Namely,

$$D = \prod_{1 \le i < j \le n} (x_i - x_j)^2$$

where $x_1, \dots, x_n$ are the roots. Recall that the coefficients of a polynomial are the
elementary symmetric functions of its roots, see Lecture 4. The discriminant $D$
is also a symmetric function of the roots, and therefore it can be expressed as a
polynomial in the coefficients of $f(x)$. The actual computation is a tedious task
but it is doable "barehanded" for polynomials of small degrees.

Example 8.3. Let us make this computation for $f(x) = x^3 + px + q$ (we know
the answer from Section 8.2). One has:

$$x_1 + x_2 + x_3 = 0, \quad x_1 x_2 + x_2 x_3 + x_3 x_1 = p, \quad x_1 x_2 x_3 = -q.$$

Thus $p$ has degree 2 as a polynomial in the roots, and $q$ has degree 3. The discriminant
has degree 6, and there are only two monomials in $p$ and $q$ of this degree: $p^3$
and $q^2$. Thus $D = ap^3 + bq^2$ with unknown coefficients $a$ and $b$.
To find these coefficients, let

$$x_1 = x_2 = t, \quad x_3 = -2t.$$

Then

$$D = 0 \quad \text{and} \quad p = -3t^2, \quad q = 2t^3.$$

This implies that $27a = 4b$. Next, let $x_1 = -x_2 = t$, $x_3 = 0$. Then

$$D = 4t^6 \quad \text{and} \quad p = -t^2, \quad q = 0.$$

It follows that $a = -4$, $b = -27$ and $D = -4p^3 - 27q^2$.

A similar, but much more tedious, computation yields the rather formidable
formula for the discriminant of the polynomial $f(x) = x^4 + px^2 + qx + r$:

$$D = 256r^3 - 128p^2r^2 - 27q^4 - 4p^3q^2 + 16p^4r + 144pq^2r. \tag{8.3}$$
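A formula of this size is easy to mistype, so a numerical sanity check is worthwhile. The sketch below (our own helper names, integer arithmetic only) builds $p, q, r$ from four roots that sum to zero and compares the product of squared root differences with the six-term expression, in which the terms $16p^4r$ and $144pq^2r$ enter with a plus sign. It is a spot check on a few examples, not a proof.

```python
from itertools import combinations
from math import prod

def disc_from_roots(roots):
    """Product of squared pairwise differences of the roots."""
    return prod((a - b) ** 2 for a, b in combinations(roots, 2))

def depressed_quartic_coeffs(roots):
    """Coefficients (p, q, r) of x^4 + p x^2 + q x + r with the given roots.

    The four roots must sum to zero so that the x^3 term vanishes.
    """
    assert len(roots) == 4 and sum(roots) == 0
    p = sum(a * b for a, b in combinations(roots, 2))
    q = -sum(a * b * c for a, b, c in combinations(roots, 3))
    r = prod(roots)
    return p, q, r

def disc_quartic(p, q, r):
    """Six-term discriminant of x^4 + p x^2 + q x + r."""
    return (256 * r**3 - 128 * p**2 * r**2 - 27 * q**4
            - 4 * p**3 * q**2 + 16 * p**4 * r + 144 * p * q**2 * r)

# A few root sets, including one with a triple root (discriminant zero).
examples = [(1, -1, 2, -2), (0, 1, 2, -3), (1, 2, 3, -6), (1, 1, 1, -3)]
```

The last example, with roots $1, 1, 1, -3$, gives $(p, q, r) = (-6, 8, -3)$, a point of the curve $p = -6x^2$, $q = 8x^3$, $r = -3x^4$ with $x = 1$.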

Another method of computing the discriminant $D$ is as follows. We want to
know when $f(x)$ and $f'(x)$ have a common root, say $a$. If this is the case then
$f(x) = (x - a)g(x)$, $f'(x) = (x - a)h(x)$ where $g$ and $h$ are polynomials of degree 3
and 2, respectively. It follows that

$$f(x)h(x) - f'(x)g(x) = 0. \tag{8.4}$$
One may consider the coefficients of $h$ and $g$ as unknowns (there are 7 of them), and
then equating all the coefficients in (8.4) yields a system of seven linear equations in
these variables. This system has a non-trivial solution if and only if its determinant
vanishes. The reader will easily check that this determinant is

[Equation (8.5): a 7 × 7 determinant]

and equating it to zero yields another form of the equation of the surface in Figure
8.11. This might be a nicer way to present the answer but it is still a separate and
not-so-pleasant task to check that this is the same as (8.3); nowadays we might
delegate this task to a computer.

A final remark: the coefficients in formula (8.3) are rather large numbers,
and one wonders whether these numbers have any combinatorial or geometrical
meaning. A conceptual explanation of this and many other similar formulas is
provided by the contemporary theory of discriminants and resultants, see the book
by I. Gelfand, M. Kapranov and A. Zelevinsky [33]; in particular, the coefficients
in (8.3) are interpreted as volumes of certain convex polytopes.

8.8 Exercises. 

8.1. (a) Consider the 1-parameter family of lines

$$p \sin x + q \cos x = 1. \tag{8.6}$$

Draw its envelope and use it to solve equation (8.6) geometrically for different values
of $(p, q)$.

(b) Same problem for the equation $\ln x = px + q$.

8.2. Draw the curve dual to the graph $y = e^x$.

8.3. (a) The equation $x^2 + y^2 = 1$ describes a circle in the affine chart $z = 1$ of
the projective plane. Draw this curve in the affine chart $x = 1$.

(b) Same question for the equation $y = 1/(1 + x^2)$.

8.4. Draw the curves in the projective plane, dual to the curves from Exercise 8.3.

8.5. Take a unit disc and paste together every pair of antipodal points on its
boundary. Prove that the resulting space is the projective plane.

8.6. (a) Prove that the projective plane with a deleted disc is a Möbius band.

(b) Prove that the set of all non-oriented lines in the plane (not necessarily
passing through the origin) is a Möbius band.

8.7. (a) Prove formula (8.3).

(b) Prove that (8.3) is equal to (8.5).
LECTURE 9

Cusps 

Among the graphs that calculus teachers love to assign their students, there are
curves containing sharp turns, which mathematicians call cusps. A characteristic
example is given in Figure 9.1 below. This is a semicubic parabola, a curve given by
the equation $y^2 = x^3$.




Figure 9.1. Semicubic parabola 

The next example is the famous cycloid (Figure 9.2). You will observe it if you 
make a colored spot on the tire of your bike and then ask your friend to ride the 
bike. The spot will trace the cycloid. 

Our last example is the so-called cardioid (Figure 9.3), a curve whose name
reflects its resemblance to a drawing of a human heart. Mathematicians usually
present this curve by the polar equation $\rho = 1 + \cos\theta$.

Figure 9.2. Cycloid




Figure 9.3. Cardioid 

Certainly, the cusps on these graphs may seem something occasional, acciden- 
tal: there are so many curves without cusps. But do not come to a premature 
conclusion. Our goal is to convince you that cusps appear naturally in so many 
geometric or analytic contexts, that we can justly say: cusps are everywhere around! 

x 2 

Let us draw an ellipse, the one given by the equation — + y 2 = 1 and a suffi- 
ciently dense family of normals to the ellipse (a normal is a line perpendicular to 
the tangent at the point of tangency, see Figure 9.4). 




Figure 9.4. A tangent and a normal to an ellipse

A picture of the ellipse and 32 of its normals is shown in Figure 9.5. Although
Figure 9.5 does not contain anything but an ellipse and 32 straight lines, we see
one more curve on it: a diamond-shaped curve with four cusps. This phenomenon
is not any special property of an ellipse. If we take a family of normals to a less
symmetric egg-shaped curve, the diamond will also lose its perfect symmetry, but
the cusps will still be there (see Figure 9.6).




Figure 9.5. An ellipse with thirty two normals 

The curve with cusps is called the evolute of the given curve (to which we
have taken the normals). It has a simple geometric, or, better to say, mechanical
description. If a particle is moving along a curve, at every single moment its
movement may be regarded as a rotation around a certain center. This center
changes its position at every moment, thus it also traces a curve. It is this curve
that we see in Figures 9.5 and 9.6. Evolutes always have cusps. Moreover, the
celebrated Four Vertex Theorem¹ (proved about 100 years ago, but still appearing
mysterious) states that if the given curve is non-self-intersecting (like an ellipse or
the egg-shaped curve of Figure 9.6), then the number of cusps on the evolute is at
least four.
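The instantaneous center of rotation described above can be computed explicitly. The sketch below is an illustration only: it assumes an ellipse parametrized as $(a\cos t, b\sin t)$ with semi-axes $a = 2$, $b = 1$ chosen by us, and uses the standard formula for the center of curvature of a parametric plane curve. For comparison it also evaluates the well-known closed form of the ellipse's evolute, $\left(\frac{a^2-b^2}{a}\cos^3 t,\ -\frac{a^2-b^2}{b}\sin^3 t\right)$.

```python
import math

def ellipse_evolute_point(a, b, t):
    """Center of curvature of the ellipse (a cos t, b sin t) at parameter t,
    computed from the general formula for a parametric plane curve."""
    x, y = a * math.cos(t), b * math.sin(t)
    dx, dy = -a * math.sin(t), b * math.cos(t)        # first derivatives
    ddx, ddy = -a * math.cos(t), -b * math.sin(t)     # second derivatives
    speed2 = dx * dx + dy * dy
    cross = dx * ddy - dy * ddx                       # equals a*b for the ellipse
    return (x - dy * speed2 / cross, y + dx * speed2 / cross)

def ellipse_evolute_closed_form(a, b, t):
    """Closed-form evolute of the ellipse, for comparison."""
    c2 = a * a - b * b
    return (c2 / a * math.cos(t) ** 3, -c2 / b * math.sin(t) ** 3)

# Assumed semi-axes, used only for illustration.
A, B = 2.0, 1.0
# The four cusps of the evolute sit at t = 0, pi/2, pi, 3*pi/2.
cusps = [ellipse_evolute_point(A, B, k * math.pi / 2) for k in range(4)]
```

In the closed form both coordinates have vanishing derivative exactly at $t = 0, \pi/2, \pi, 3\pi/2$, which is where the four cusps of the diamond-shaped curve appear; these parameter values are the points of extremal curvature of the ellipse.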

For self-intersecting curves this is no longer true; the next picture (Figure 9.7) 
shows a family of normals to a self-intersecting curve; the evolute is clearly visible 
on this picture, and it has only two cusps. 

To be honest, this seemingly spontaneous appearance of a curve with cusps
on a picture of a family of normals is not directly related to normals. You will
see something very similar, if you take a "sufficiently arbitrary", or "sufficiently
random" family of lines. Imagine an angry professor who throws his cane at his
students. The cane flies and rotates in its flight. If you draw a family of subsequent
positions of the cane in the air, you will see something like Figure 9.8.

¹ See Lecture 10.

Figure 9.7. A self-intersecting curve with normals




Figure 9.8. A flying cane (straight) 




Figure 9.9. A flying cane (curvy) 

You see here 32 subsequent positions of the cane, but also a curve looking a
bit like the cycloid (Figure 9.2), with cusps (one of the cusps is clearly seen in the
middle of the drawing). And straight lines do not play any special role; it is simply
more convenient to draw them. If the professor is old and heavy, and his cane
has long lost its linearity, then the picture of Figure 9.8 will look different, but
the cusps will remain (see Figure 9.9).

But let us turn to another geometric construction where cusps arise in an even
more unexpected way. Let us again begin with an ellipse. Imagine that all points of
our ellipse simultaneously begin moving at a constant speed, the same for all points,
and that every point moves inside the ellipse along the normal to the ellipse. At
first, the ellipse shrinks, but still retains its smooth oval shape (Figure 9.10).




Figure 9.10. First, the ellipse retains its oval shape 
But then the points begin forming sort of crowds at the left hand and right
hand extremities of the curve (Figure 9.11), then the trajectories of the points cross
each other (no collisions, they pass through each other), and, believe it or not, the
curve acquires four cusps (Figure 9.12).




Figure 9.11. Then the points begin forming crowds 






Figure 9.12. Eventually, the curve acquires cusps 

The evolution of the moving curve, which is conveniently called a front, can 
be seen on Figure 9.13. We see that after the appearance of the four cusps, the 
curve consists of four sections between the cusps, two short and two long, and the 
long sections cross each other twice. Then the short sections become longer, and 
the long sections become shorter. At some moment, the "long" sections (which are 
not so long at this moment) go apart; then the "short" sections (which are quite 
long at this moment) meet and form two crossings. Then the cusps bump into each 
other and disappear, and the curve again becomes more or less elliptic. 

It is interesting to draw all the fronts of Figure 9.13 on one picture. The cusps 
of the fronts form a curve themselves (Figure 9.14), and if you compare Figure 9.14 
with Figure 9.5, you will see that our curve is nothing but the evolute of the ellipse. 
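The claim that the cusps of the fronts trace the evolute can be tested numerically. The sketch below is an illustration under our own assumptions: the ellipse is parametrized as $(a\cos s, b\sin s)$ with semi-axes $a = 2$, $b = 1$ chosen by us, every point moves inward along the normal with unit speed, and the curvature radius of the ellipse is taken as $|\gamma'(s)|^3/(ab)$. The front becomes singular at the point that started from parameter $s$ exactly when the elapsed time equals the curvature radius there, and at that moment the point sits at the center of curvature.

```python
import math

def ellipse_frame(a, b, s):
    """Point, inward unit normal and curvature radius of the ellipse
    (a cos s, b sin s) at parameter s."""
    x, y = a * math.cos(s), b * math.sin(s)
    dx, dy = -a * math.sin(s), b * math.cos(s)
    speed = math.hypot(dx, dy)
    normal = (-dy / speed, dx / speed)   # tangent rotated by +90 degrees points inward
    radius = speed ** 3 / (a * b)        # curvature radius of the ellipse
    return (x, y), normal, radius

def front_point(a, b, s, t):
    """Position at time t of the point that started at parameter s and moves
    inward along the normal with unit speed."""
    (x, y), (nx, ny), _ = ellipse_frame(a, b, s)
    return (x + t * nx, y + t * ny)

def front_speed(a, b, s, t, h=1e-6):
    """Length of the numerical derivative of the front with respect to s
    (central difference); it vanishes at a cusp of the front."""
    x1, y1 = front_point(a, b, s - h, t)
    x2, y2 = front_point(a, b, s + h, t)
    return math.hypot(x2 - x1, y2 - y1) / (2 * h)

def evolute_point(a, b, s):
    """Closed-form evolute of the ellipse, used here only for comparison."""
    c2 = a * a - b * b
    return (c2 / a * math.cos(s) ** 3, -c2 / b * math.sin(s) ** 3)
```

Evaluating `front_speed` at the time equal to the curvature radius returns a value close to zero, while at half that time it is clearly positive; at the singular moment `front_point` agrees with `evolute_point` up to rounding.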

Similarly the movement of the fronts of the self-intersecting curve of Figure 
9.7 is shown on Figure 9.15. The drawing on Figure 9.16 presents the whole family 
on one picture; if you trace, mentally, the curve of cusps, you will get the two-cusp 
evolute visible on Figure 9.7. 

If you want to have more examples, observe the family of fronts of a sine wave 
(Figure 9.17); you can guess, what the evolute of a sine wave looks like (the evolute 
of a curve with inflection points always has asymptotes; these asymptotes are the 
normals to the curve at the inflection points; if the words "inflection points" and 
"asymptotes" do not mean much to you, forget about them). 

Still, all these examples do not seem to justify the statement "cusps are
everywhere around." One can argue, "if cusps are everywhere around, then why
don't we see them?" But we do!

Figure 9.13. The evolution of a front

Figure 9.14. The fronts and the evolute



To be convinced, let us look around through the eyes of a great artist. Look 
at the famous portrait of Igor Stravinsky drawn in 1932 by Pablo Picasso (Figure 
9.18). This picture of a great composer made by a great artist, which probably 
bears a notable resemblance to the original, is, actually, nothing but several dozens 
of pencil curves. But Igor Stravinsky's face was not made of curves! Then, what do 
these curves represent? And why do they stop abruptly without any visible reason? 

Certainly, this drawing is too complicated to begin thinking of such things. Let
us consider a simpler drawing. Imagine young Pablo Picasso first entering an
art school in his native Malaga, or, maybe, later, in Barcelona. It is very probable
that his teacher offered him a jug to draw (art students often begin their studies
with jugs).

It is very unlikely that Pablo's picture of a jug, even if it ever existed, can be 
still found anywhere. But maybe it looked like one of the drawings on Figure 9.19. 

Or, maybe, the art teacher was a geometry lover, and Pablo's first assignment 
was a torus (the surface of a bagel, if you do not know what a torus is). Then 
Pablo's first drawing could look like Figure 9.20. 

On these simple drawings, we see the same things as on the masterpiece: there 
are curves, some of them end abruptly, either when they meet other curves, or 
without any visible reasons. 

Let us think about the reasons. The curves we see (and draw) are boundaries
of visible shapes, or, in other words, they are made of points of tangency of the
rays from our eyes to the surface we are looking at. Let us denote this surface
by $S$ and the curve made of tangency points by $C$ (see Figure 9.21). If we place,
mentally, a screen behind the surface, then our rays will trace a curve on the screen,
and this curve $C'$ looks precisely like the contour of the surface that we see. If the
shape of the surface $S$ is more complicated, then some parts of the curve $C$ may be
hidden from our eye by the surface (geometrically this means that the ray crosses
the surface, maybe more than one time, before the tangency). This is what happens
where a curve stops when meeting another curve: if the things we are drawing were
transparent, the curve would not have stopped, it would have gone further as a
smooth curve.

Figure 9.15. Evolution of fronts for a self-intersecting curve

Figure 9.16. The fronts of a self-intersecting curve

Figure 9.17. The fronts of a sine wave

The second case, when the curve stops without meeting another curve, is more
interesting. As we have said before, the tangency points of the rays of our vision
form a curve $C$ on our surface $S$. A simple analytic argument, which we skip here,
shows that this curve $C$ is always smooth. However, the ray may be tangent not
only to the surface $S$, but also to the curve $C$. In this case, the image of this
curve on the screen, and, hence, inside our eye, or on our drawing, forms a cusp
(see Figure 9.22). But we see only a half of this cusp, while the second half is
hidden behind the shape. Thus, if the things around us were transparent (sounds
Nabokovian!), we would never have seen stopping curves; we would rather have
seen a lot of cusps, which in real life are visible only by half.

Figure 9.18. Pablo Picasso. Portrait of Igor Stravinsky (1932)

For example, if Pablo's jugs and torus were transparent, he would have
supplemented his drawing by the curves shown (dotted) in Figure 9.23.

In conclusion, let us look at the projections of a transparent torus. To make 
it transparent, we replace it by a dense family of circles in parallel planes. More 
precisely, a torus is a surface of revolution of a circle around an axis not crossing 
the circle (see Figure 9.24 a). We replace the circle by a dense set of points, in our 
example by the set of vertices of an inscribed regular 32-gon (Figure 9.24 b). 
Figure 9.19. Pablo ??. Two jugs




Figure 9.20. Pablo ???. A torus 



Four projections (under slightly different angles) along with magnifications of 
the central fragments of these projections are shown on Figures 9.25, 9.26, 9.27, 
and 9.28. The curve with four cusps is seen on each of these projections. 







Figure 9.21. A visible (apparent) contour of a simple shape 




Figure 9.22. A visible contour of a complicated shape 




Figure 9.23. Transparent things 
Figure 9.24. Replace a circle by a set of 32 points



Figure 9.25. A projection of a torus 
Figure 9.26. Another projection of a torus




Figure 9.27. One more projection of a torus 
Figure 9.28. The last projection of a torus




LECTURE 10 

Around Four Vertices 

10.1 The theorem. There are results in mathematics that could have been 
easily discovered much earlier than they actually were. One example that comes 
to mind is Pick's formula from Exercise 1.1 that could have been known to the 
Ancient Greeks, but it was discovered by Georg Pick only in 1899. 

The subject of this lecture is the four vertex theorem, and the reader will agree
with us that it also could have been discovered much earlier, say, by Huygens or
Newton. However, the four vertex theorem was published by the Indian mathematician
S. Mukhopadhyaya only in 1909.

The four vertex theorem states that a plane oval has at least 4 vertices. An
oval for us will always be a closed smooth curve with positive curvature.¹ A vertex
of a curve is a local maximum or minimum of its curvature. That a closed curve has
at least two vertices is obvious: the curvature attains a maximum and a minimum
at least once.

10.2 Caustics, evolutes, involutes and osculating circles. Consider a 
smooth curve $\gamma$ in the plane. At every point $x \in \gamma$, one has a family of circles 
tangent to the curve at this point, see Figure 10.1. Of these circles, one is "more 
tangent" than the others; it is called the osculating circle. 

The definition of the osculating circle is as follows. Let two points start moving 
in the same direction from point x with unit speed, one along the curve $\gamma$ and 
another along a tangent circle. For all tangent circles, but one, the distance between 
the points will grow quadratically with time, and only for one exceptional circle the 
rate of growth will be cubic. This is the osculating circle. 

The osculating circle can be constructed as follows. Give $\gamma$ some parameterization 
so that $x = \gamma(t)$. Consider three close points $\gamma(t - \varepsilon)$, $\gamma(t)$, $\gamma(t + \varepsilon)$. There 
is a unique circle through these three points (we do not exclude the case of a line, 
a circle of infinite radius). As $\varepsilon \to 0$, the limiting position of this circle is the 
osculating circle of $\gamma$ at the point x. One can say that the osculating circle has 
3-point contact with the curve or that it is second order tangent to the curve. 

1 Ovum is an egg in Latin. 

Figure 10.1. The osculating circle of a curve 

The radius of the osculating circle is called the curvature radius and its recip- 
rocal the curvature of the curve at the given point; the center of the osculating 
circle is called the center of curvature of the curve. If the curve is given an arc 
length parameterization, that is, one moves along the curve with unit speed, then 
the curvature is the magnitude of the acceleration vector. 2 
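
The three-point construction is easy to try numerically. The following Python sketch is not part of the original text: the function names and the choice of the parabola $y = x^2$ as a test curve are ours, purely for illustration. It computes the circle through three nearby points of a curve, so that one can watch its radius approach the curvature radius as the points merge.

```python
import math

def circumcircle(p, q, r):
    """Center and radius of the circle through three non-collinear points."""
    (ax, ay), (bx, by), (cx, cy) = p, q, r
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

def parabola(t):
    """Illustrative test curve (our assumption): the parabola y = x^2."""
    return (t, t * t)

def osculating_circle_estimate(curve, t, eps):
    """Circle through curve(t - eps), curve(t), curve(t + eps)."""
    return circumcircle(curve(t - eps), curve(t), curve(t + eps))
```

At the vertex of this parabola the curvature equals 2, so for small eps the estimated radius should be close to 1/2 and the estimated center close to (0, 1/2).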

The order of tangency of the osculating circle at a vertex of a curve is higher 
than at an ordinary point: the circle has 4-point contact with the curve at a vertex, 
that is, the osculating circle hyperosculates. 

Imagine that our curve is a source of light: rays of light emanate from $\gamma$ in 
the perpendicular direction (in the plane of $\gamma$). The envelope $\Gamma$ of this 1-parameter 
family of normals will be especially bright; this envelope is called the caustic, 3 or 
the evolute, of the curve. See Figure 10.2 and the figures in Lecture 9. The curve $\gamma$ 
is called an involute of the curve $\Gamma$. Evolutes and involutes are the main characters 
of this lecture. 

Lemma 10.1. The evolute of a curve is the locus of centers of curvature. A 
vertex of the curve corresponds to a singularity of the evolute, generically, a cusp. 

Proof. Let $\gamma(t)$ be a parameterization of the curve. The equation of the normal 
line to $\gamma$ at the point $\gamma(t)$ involves the first derivative, $\gamma'(t)$, and the coordinates 
of the intersection point of two infinitesimally close normals, that is, the center of 
curvature, involve the first two derivatives, $\gamma'(t)$ and $\gamma''(t)$ (see the equation of an 
envelope in Section 8.3). 

This means that, computing the center of curvature of the curve at point x, one 
may replace the curve by its osculating circle which is second order tangent with 
the curve at this point. The normals of a circle intersect at its center. Hence the 
infinitesimally close normals to the curve at x intersect at the center of this circle. 

Likewise, the velocity vector of the evolute involves the first three derivatives 
of the vector valued function $\gamma(t)$. At a vertex, the curve is approximated by the 
osculating circle up to three derivatives. Therefore the evolute of the curve at a 
vertex has the same velocity vector as the evolute of a circle. But the latter is a 
point! This means that the velocity of the evolute vanishes and it has a singular 
point. □ 

2 This is known to every driver: the sharper a turn, the harder to negotiate it. 
3 From Greek kaustikos via Latin causticus, "burning". 

Figure 10.2. The evolute of a curve 

Let us add that, at a local maximum of curvature, the cusp of the evolute 
points toward the curve, while at a local minimum of curvature it points from the 
curve. 

It follows from Lemma 10.1 that the four vertex theorem can be reformulated 
as a four cusp theorem for the evolute. Note however that, unlike the four vertices, 
the singular points of the evolute may merge together: for example, the evolute of 
a circle is just one (very singular) point. 

Orient the normals to $\gamma$ inward. Then the smooth arcs of the evolute also get 
oriented. What happens to this orientation at a cusp is shown in Figure 10.3. It 
follows that the cusps partition the evolute into arcs with opposite orientations. 
Hence, if $\gamma$ is closed, the evolute has an even number of cusps. 




FIGURE 10.3. Orientations of cusps 

Evolutes may have cusps but they do not have inflection points. The reason 
was explained in Lecture 8. The normals to a smooth curve determine a smooth 
curve in the dual plane, and the envelope is dual to this smooth curve. An inflection 
is dual to a cusp, so the evolute is inflection-free. 

The reader who finds this argument too high-brow may consider a more down- 
to-earth approach: should $\Gamma$ have an inflection point, at some point of $\gamma$ there would 
be two different inward normals to $\gamma$, see Figure 10.4. 

Figure 10.4. Evolutes have no inflections 



Given the evolute $\Gamma$, can one reconstruct the initial curve? In other words, how 
does one construct an involute of a curve? The answer is given by the following string 
construction. Choose a point y on $\Gamma$ and wrap a non-stretchable string around $\Gamma$ 
starting at y. Then the free end of the string, x, will draw an involute of $\Gamma$, see 
Figure 10.5. 



Proof of the string construction. We need to see that the velocity of the point 
x is perpendicular to the segment zx. Physically, this is obvious: the radial com- 
ponent of the velocity of x would stretch the string. 

The reader suspicious of this argument will probably be satisfied by the follow- 
ing calculus proof. Give $\Gamma$ the arc-length parameterization $\Gamma(t)$ so that $y = \Gamma(0)$, 
and let c be the length of the string. Then $z(t) = \Gamma(t)$ and $x(t) = \Gamma(t) + (c - t)\Gamma'(t)$. 
Hence 

$$x'(t) = \Gamma'(t) - \Gamma'(t) + (c - t)\Gamma''(t) = (c - t)\Gamma''(t),$$

and the acceleration vector $\Gamma''(t)$ is orthogonal to the velocity $\Gamma'(t)$ since t is an 
arc-length parameter. □ 

Figure 10.5. The string construction of the involute 
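
The calculus proof can be checked numerically. In the sketch below (ours, not from the original text) the role of the evolute is played by the unit circle in its arc-length parameterization, an assumption made only to keep the example short; the function names are hypothetical. The code differentiates the free end of the string by finite differences and computes the inner product of its velocity with the direction of the string.

```python
import math

def evolute_point(t):
    """Unit circle, arc-length parameterized (stands in for the evolute)."""
    return (math.cos(t), math.sin(t))

def evolute_velocity(t):
    """Unit tangent vector of the circle at parameter t."""
    return (-math.sin(t), math.cos(t))

def involute(t, c):
    """Free end of a string of length c: x(t) = Gamma(t) + (c - t) Gamma'(t)."""
    gx, gy = evolute_point(t)
    vx, vy = evolute_velocity(t)
    return (gx + (c - t) * vx, gy + (c - t) * vy)

def involute_velocity(t, c, h=1e-6):
    """Central finite-difference approximation of x'(t)."""
    (x0, y0), (x1, y1) = involute(t - h, c), involute(t + h, c)
    return ((x1 - x0) / (2 * h), (y1 - y0) / (2 * h))

def dot_with_string(t, c):
    """Inner product of x'(t) with the string direction Gamma'(t)."""
    wx, wy = involute_velocity(t, c)
    vx, vy = evolute_velocity(t)
    return wx * vx + wy * vy
```

For the unit circle, the inner product should vanish up to discretization error, and the speed of the free end should be close to c - t, the length of the unwrapped part of the string.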

Note that the string construction yields not just one but a whole 1-parameter 
family of involutes: the parameter is the length of the string. Every two involutes 
are equidistant: the distance between them along their common normals is constant. 
The relation between involutes and evolutes resembles the one between functions 
and their derivatives: recovering a function from its derivative involves a constant 
of integration. 

The string construction implies the following property. 

Corollary 10.1. The length of an arc of the evolute $\Gamma$ equals the difference 
of its tangent segments to the involute $\gamma$, that is, the increment of the radii of 
curvature of $\gamma$. 

Consider the evolute of a plane oval. Let us adopt the convention that the sign 
of the length of the evolute changes after each cusp; this convention makes sense 
because the number of cusps is even. 

Lemma 10.2. The total length of the evolute is zero. 




Figure 10.6. The total length of the evolute is zero 

Proof. Consider Figure 10.6. If the radii of curvature are $r_1, R_1, r_2, R_2$ then, 
according to Corollary 10.1, the arcs of the evolute have lengths $R_1 - r_1$, $R_1 - r_2$, 
$R_2 - r_2$ and $R_2 - r_1$, and their alternating sum vanishes. The general case is 
similar. □ 

The zero length property, of course, again implies the existence of cusps of the 
caustic of an oval, but this is hardly new to us. 

Consider an arc $\gamma$ with monotonic positive curvature and draw a few osculating 
circles to it. Most likely, your picture looks somewhat like Figure 10.7. This is 
wrong! A correct picture is shown in Figure 10.8, as the following Tait-Kneser theorem 
shows. 

Figure 10.7. A wrong picture of osculating circles 




Figure 10.8. Osculating circles $s_1, s_2, s_3, s_4$ to a curve $\gamma$ are 
shown. Although the circles $s_1, s_2, s_3, s_4$ are all disjoint, they seem 
tangent to each other. No wonder: the shortest distance between 
the circles $s_1$ and $s_2$ is approximately 0.2% of the radius of $s_1$, and 
the shortest distance between the circles $s_1$ and $s_4$ is approximately 
5% of the radius of $s_1$. 



Theorem 10.2. The osculating circles of an arc with monotonic positive cur- 
vature are nested. 

Proof. Consider Figure 10.9. The length of the arc $z_1 z_2$ equals $r_1 - r_2$, hence 
$|z_1 z_2| < r_1 - r_2$. Therefore the circle with center $z_1$ and radius $r_1$ contains the circle 
with center $z_2$ and radius $r_2$. □ 
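
The reader who distrusts the picture may test the theorem on a concrete arc. The following sketch is our illustration, not part of the original text; it assumes the parabola $y = x^2$ with $x > 0$, where the curvature decreases monotonically, and uses hypothetical function names. It computes osculating circles from the standard formulas for the center and radius of curvature of a graph and checks the inequality used in the proof: one circle lies strictly inside another when the distance between the centers plus the smaller radius is less than the larger radius.

```python
import math

def osculating_circle(x):
    """Center and radius of the osculating circle of y = x**2 at (x, x**2)."""
    d1, d2 = 2.0 * x, 2.0          # first and second derivatives of x**2
    s = 1.0 + d1 * d1
    center = (x - d1 * s / d2, x * x + s / d2)
    radius = s ** 1.5 / abs(d2)
    return center, radius

def is_strictly_inside(inner, outer):
    """True if the circle `inner` lies strictly inside the circle `outer`."""
    (c1, r1), (c2, r2) = inner, outer
    return math.dist(c1, c2) + r1 < r2

def all_nested(xs):
    """Check that osculating circles at increasing abscissas xs are nested."""
    circles = [osculating_circle(x) for x in xs]
    return all(is_strictly_inside(circles[i], circles[j])
               for i in range(len(circles))
               for j in range(i + 1, len(circles)))
```

On the arc with x between 0.2 and 2 every pair of circles should turn out nested, while the symmetric pair at x = -1 and x = 1, where the curvature is not monotonic in between, is not.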







FIGURE 10.9. Proof of the Tait-Kneser theorem 10.2 

Consider Figure 10.10 featuring a spiral $\gamma$ and a 1-parameter family of its 
osculating circles. The circles are disjoint and their union is an annulus. 4 This 
figure is quite amazing (although this may not be obvious at first glance)! 




Figure 10.10. An annulus filled with disjoint osculating circles 

First of all, Figure 10.10 depicts 16 circles; the curve which you see "snaking" 
between the circles is not drawn, but you clearly see it as the envelope of the circles. 
Secondly, the partition of the annulus into disjoint circles is quite paradoxical, in 
the following sense. 

Proposition 10.3. If a differentiable function in the annulus is constant on 
each circle then this is a constant function. 5 

Thinking about what this proposition claims, the reader is likely to conclude 
that it cannot be true. For example, there is an obviously non-constant function on 
the annulus assigning to each point the radius of the circle passing through this point. 
This function is clearly constant on the circles and non-constant on the annulus. 
This function is so natural that it is hard to believe that it is not differentiable. 
But differentiable it is not! 



4 In technical terms, the annulus is foliated by circles. 

5 In technical terms, the foliation is not differentiable, although its leaves are perfect circles. 

Proof. If a function f is constant on the circles, its differential vanishes on the 
tangent vectors to these circles. Since the curve $\gamma$ is tangent to some circle at every 
point, the differential df vanishes on $\gamma$. Hence f is constant on $\gamma$. But $\gamma$ intersects 
all the circles that form the annulus, so f is constant everywhere. □ 

Remark 10.4. Most mathematicians are brought up to believe that things 
like non-differentiable functions do not appear in "real life"; they are invented 
as counterexamples to reckless formulations of mathematical theorems and belong 
to books with titles such as "Counterexamples in analysis" or "Counterexamples 
in topology". Proposition 10.3 provides a perfectly natural example of such a 
situation, and there is nothing artificial about it at all. 

10.3 Proof of the four vertex theorem. A plane oval $\gamma$ can be described 
by its support function. Choose the origin O, preferably inside $\gamma$. Given a direction 
$\varphi$, consider the tangent line to $\gamma$ perpendicular to this direction, and denote by $p(\varphi)$ 
the distance from this tangent line to the origin, see Figure 10.11. If the origin is 
outside of the oval, this distance will be signed. 




Figure 10.11. The support function of an oval 

The support function uniquely determines the family of tangent lines to $\gamma$, and 
therefore the curve itself, as the envelope of this family. One can express all the 
interesting characteristics of $\gamma$, such as the perimeter length or the area bounded 
by it, in terms of $p(\varphi)$, but we do not need these formulas. 

What we need to know is how the support function depends on the choice of 
the origin. Let O' = O + (a, b) be a different origin. 

Lemma 10.5. The new support function $p'$ (here the prime marks the new origin, 
not a derivative) is given by the formula 

(10.1) $p'(\varphi) = p(\varphi) - a\cos\varphi - b\sin\varphi.$

Proof. Every parallel translation can be decomposed into one in the direction 
of $\varphi$ and one in the orthogonal direction. For a translation distance r in the former 
direction, $a = r\cos\varphi$, $b = r\sin\varphi$, and (10.1) gives $p' = p - r$, as required. For a 
translation in the orthogonal direction, $a = -r\sin\varphi$, $b = r\cos\varphi$, and (10.1) gives 
$p' = p$, as required. □ 

What are the support functions of circles? If a circle is centered at the origin, 
its support function is constant. By the previous lemma, the support functions of 
circles are linear harmonics, the functions 

$$p(\varphi) = c + a\cos\varphi + b\sin\varphi.$$
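
Formula (10.1) and the description of the support functions of circles can be tested on sampled curves. The sketch below is ours, not part of the original text: it assumes that the oval is replaced by a fine polygon of sample points (an ellipse with hypothetical semi-axes) and computes the support function as the largest projection of the sample points onto the direction $\varphi$; all names are illustrative.

```python
import math

def sample_ellipse(semi_x=2.0, semi_y=1.0, n=2000):
    """Points on an ellipse centered at the origin (an illustrative oval)."""
    return [(semi_x * math.cos(2 * math.pi * i / n),
             semi_y * math.sin(2 * math.pi * i / n)) for i in range(n)]

def support(points, phi, origin=(0.0, 0.0)):
    """Signed distance from `origin` to the tangent line perpendicular to phi."""
    ox, oy = origin
    c, s = math.cos(phi), math.sin(phi)
    return max((x - ox) * c + (y - oy) * s for x, y in points)

def shifted_by_formula(points, phi, a, b):
    """Right-hand side of (10.1): p(phi) - a cos(phi) - b sin(phi)."""
    return support(points, phi) - a * math.cos(phi) - b * math.sin(phi)
```

Moving the origin to (a, b) and recomputing the support function directly should agree with `shifted_by_formula`; for a sampled circle of radius c the same functions reproduce a linear harmonic up to the discretization error of the polygon.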

We can now characterize vertices in terms of the support function. 

Lemma 10.6. The vertices of a curve correspond to the values of $\varphi$ for which 

(10.2) $p'''(\varphi) + p'(\varphi) = 0.$

Proof. Vertices are the points where the curve has third-order contact with 
a circle. In terms of the support functions, this means that $p(\varphi)$ coincides with 
$a\cos\varphi + b\sin\varphi + c$ up to the third derivative. It remains to note that linear harmonics 
satisfy (10.2) identically. □ 

Thus the four vertex theorem can be reformulated as follows. 

Theorem 10.3. Let $p(\varphi)$ be a smooth $2\pi$-periodic function. Then the equation 
$p'''(\varphi) + p'(\varphi) = 0$ has at least 4 distinct roots. 

Proof. A function on the circle changes sign an even number of times. The 
mean value of the function $f = p''' + p'$ is zero since it is the derivative of $p'' + p$, 
hence f changes sign at least twice. Assume that f changes sign exactly twice, at 
points $\varphi = \alpha$ and $\varphi = \beta$. 

One can find constants a, b, c such that the linear harmonic $g(\varphi) = c + a\cos\varphi + b\sin\varphi$ 
changes sign exactly at the same points, $\alpha$ and $\beta$, and so that f and g have 
the same signs everywhere. For example, 

$$\pm g(\varphi) = \cos\left(\varphi - \frac{\alpha + \beta}{2}\right) - \cos\frac{\alpha - \beta}{2}.$$

Then 

$$\int_0^{2\pi} f(\varphi)\, g(\varphi)\, d\varphi > 0.$$

On the other hand, integration by parts yields: 

$$\int_0^{2\pi} (p''' + p')\, g\, d\varphi = -\int_0^{2\pi} (p'' + p)\, g'\, d\varphi = \int_0^{2\pi} (p' g'' - p g')\, d\varphi = -\int_0^{2\pi} p\, (g''' + g')\, d\varphi = 0$$

since $g''' + g' = 0$. This is a contradiction. □ 
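
Theorem 10.3 can be explored numerically for trigonometric polynomials. The sketch below is our illustration, not part of the original text; the representation of p by a dictionary of Fourier coefficients and all function names are assumptions made for the example. For $p = c + \sum_k (a_k \cos k\varphi + b_k \sin k\varphi)$ one computes $p''' + p' = \sum_k (k - k^3)(-a_k \sin k\varphi + b_k \cos k\varphi)$, so the linear harmonics ($k = 1$) drop out, and the code counts the sign changes of this function around the circle on a fine grid.

```python
import math

def vertex_function(coeffs, phi):
    """f = p''' + p' for p = c + sum_k (a_k cos k*phi + b_k sin k*phi).

    coeffs maps k >= 1 to (a_k, b_k); the constant term of p plays no role.
    """
    return sum((k - k**3) * (-a * math.sin(k * phi) + b * math.cos(k * phi))
               for k, (a, b) in coeffs.items())

def cyclic_sign_changes(values):
    """Number of sign changes in a cyclic sequence, ignoring exact zeros."""
    signs = [v > 0 for v in values if v != 0.0]
    return sum(signs[i] != signs[i - 1] for i in range(len(signs)))

def count_vertices(coeffs, n=10000):
    """Sign changes of p''' + p' sampled at n equally spaced angles."""
    values = [vertex_function(coeffs, 2 * math.pi * i / n) for i in range(n)]
    return cyclic_sign_changes(values)
```

A grid count can miss closely spaced roots, so this is an experiment rather than a proof. For a pure linear harmonic the function vanishes identically (every point of a circle is a vertex) and the count is 0, while for a p containing higher harmonics the count should be an even number not smaller than 4.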



Remark 10.7. Theorem 10.3 has generalizations. A smooth $2\pi$-periodic func- 
tion has a Fourier expansion 

$$f(\varphi) = \sum_{k \geq 0} (a_k \cos k\varphi + b_k \sin k\varphi).$$

The Fourier series of the function $f = p''' + p'$ does not contain linear harmonics. 
The Sturm-Hurwitz theorem states that if the Fourier expansion of a function starts 
with n-th harmonics, that is, $k \geq n$ in the sum above, then this function has at 
least 2n distinct zeroes on the circle $[0, 2\pi)$. The above proof can be adjusted to 
this more general set-up; other proofs are known as well, see, e.g., [57] and Exercise 
10.6. 

10.4 Two other proofs. Like almost every good mathematical result, the 
four vertex theorem has a number of different proofs. We present two geometrical 
ones in this section. We know many other proofs, using different ideas and gener- 
alizing in different directions; one would be hard pressed to choose the "proof from the 
Book" 6 for the four vertex theorem. 

First proof. [79]. Consider the evolute $\Gamma$ of an oval $\gamma$. According to Lemma 
10.2, the length of $\Gamma$ is zero, and hence it has at least two cusps. Assume that the 
(even) number of cusps is exactly two. 



Figure 10.12. The number of tangents to a curve 

Given a point x in the plane, let n(x) be the number of normals to $\gamma$ that pass 
through this point. In other words, n(x) equals the number of tangent lines from 
x to $\Gamma$. This function is locally constant in the complement of the evolute. When 
x crosses $\Gamma$ from the concave to the convex side, the value of n(x) increases by 2, 
see Figure 10.12. 

Figure 10.13. Proof of the four vertex theorem 

For every point x, the distance to $\gamma$ has a minimum and a maximum. Therefore 
there are at least two perpendiculars from x to the oval, and hence $n(x) \geq 2$ for 
every x. Since the normals to $\gamma$ turn monotonically and make one complete turn, 
$n(x) = 2$ for all points x sufficiently far away from the oval. 

Consider the line through two cusps of the evolute; assume it is horizontal, see 
Figure 10.13. Then the height function, restricted to $\Gamma$, attains either a minimum, or 
a maximum, or both, away from the cusps. Assume it is a maximum (as in Figure 10.13); 
draw the horizontal line $\ell$ through it. Since the evolute lies below this line, $n(x) = 2$ 
above it. Therefore $n(x) = 0$ immediately below $\ell$. This is a contradiction proving 
the four vertex theorem. □ 

6 Paul Erdős used to refer to "The Book" in which God keeps the most elegant proofs of 
mathematical theorems, see [2]. 

Outline of the second proof. This follows ideas of the famous French mathe- 
matician R. Thom. This is a beautiful argument indeed! 

For every point x inside the oval $\gamma$, consider the closest point y on the oval. Of 
course, for some points x, the closest point is not unique. The locus of such points 
is called the symmetry set; denote it by A. For example, for a circle, A is its center, 
and for an ellipse, A is the segment between the two centers of maximal curvature. 
For a generic oval, A is a graph (with curved edges), and its vertices of valence 1 
are the centers of local maximal curvature of $\gamma$, see Figure 10.14. 

Figure 10.14. The symmetry set of an oval 

The last claim needs an explanation. It is clear that the vertices of A of valence 
1 are the centers of extremal curvature (where two points labeled y in Figure 10.14 
merge together). But why not centers of minimal curvature? This is because an 
osculating circle of minimal curvature locally lies outside of the curve $\gamma$. Therefore 
the distance from the center of such a circle to the curve is less than its radius and 
hence its center does not belong to the symmetry set A. 

Delete the symmetry set from the interior of $\gamma$. What remains can be contin- 
uously deformed to the boundary oval by moving every point x toward the closest 
point y. Hence the complement of A is an annulus, and therefore A has no loops 
(and consists of only one component). Thus A is a tree which necessarily has at 
least two vertices of valence 1. It follows that the curvature of the oval has at least 
two local maxima, and we are done. □ 

10.5 Sundry other results. Since its discovery about a century ago, the 
four vertex theorem and its numerous generalizations keep attracting the attention 
of mathematicians. In this last section we describe, without proofs, a few results 
around four vertices (see, e.g., [57]). 

One way to generalize the four vertex theorem is to approximate a smooth plane 
curve not by its osculating circles but by some other type of curves, for instance, 
conics. There exists a unique conic through every generic 5 points in the plane. 
Taking these points infinitesimally close to each other on an oval, one obtains its 
osculating conic. Just like an osculating circle, this conic may hyperosculate: such 
a point of the curve is called sextactic (or, for reasons that we will not discuss here, 
an affine vertex). What is the least number of sextactic points of a plane oval? The 
answer is six, and this was proved by Mukhopadhyaya in his 1909 paper. 

One can also approximate a curve by its tangent lines. Then we are interested 
in inflections, the points where the tangent line is second-order tangent to a curve. 
Of course, an oval does not have inflections. However, consider a simple closed 
curve in the projective plane. Recall from Lecture 8 that the projective plane is the 
result of pasting together pairs of antipodal points of the sphere. A closed curve 
in the projective plane can be drawn on the sphere, either as a closed curve or as 
a curve whose endpoints are antipodal, and an inflection is its abnormal tangency 
with a great circle. 

Assume that our curve belongs to the latter type, that is, its endpoints are 
antipodal. Then a theorem of Möbius (1852) states that the curve has at least three 
inflections. See Figure 10.15 where the sphere is projected on a plane from its 
center, and so the curves appear to escape to infinity; this figure features a simple 
curve with 3 inflections and a self-intersecting curve with only 1 inflection. 




Figure 10.15. Inflection points of a curve in the projective plane 

By the way, there is another, much more recent, result about spherical curves. 
Assume that a smooth closed simple curve bisects the area of the sphere. Then it 
has at least 4 inflection points. This result, which V. Arnold called the "tennis ball 
theorem", 7 was discovered by B. Segre (1968) and rediscovered by Arnold in the 
late 1980s. 

One may also approximate a smooth curve by a cubic curve. An algebraic curve 
of degree 3 is determined by 9 of its points, and the osculating cubic of a smooth 
curve passes through 9 infinitesimally close points. A cubic is hyperosculating if 
it passes through 10 such points, and the respective point of the curve is called 
3-extactic (in this somewhat cumbersome terminology, 2-extactic = sextactic and 
1-extactic = inflection). 

A typical cubic curve looks like one of the curves in Figure 10.16; in the latter 
case, its bounded component is called an oval. Recently V. Arnold discovered the 
following theorem: a smooth curve, obtained by a small perturbation of the oval 
of a cubic curve, has at least ten 3-extactic points. It is tempting to continue by 
increasing the degree of approximating algebraic curves but, to the best of our 
knowledge, no further results in this direction are available. 




Figure 10.16. Cubic curves 



In the early 1990s, E. Ghys discovered the following beautiful theorem. The 
real projective line is obtained from the real line by adding one point "at infinity". 
This extension has clear advantages. For example, a fractional-linear function 

$$f(x) = \frac{ax + b}{cx + d}, \qquad ad - bc \neq 0,$$

is not a well-defined function of the real variable: if $x = -d/c$ then $f(x) = \infty$; but 
f regains its status of a well-defined and invertible function from the real projective 
line to itself ($f(\infty) = a/c$). 

Let f(x) be a smooth invertible function from the real projective line to itself. 
At every point x, one can find a fractional-linear function whose value, whose first 
and whose second derivatives coincide with those of f at the point x. It is natural 
to call this the osculating fractional-linear function. A fractional-linear function is 
hyperosculating at x if its third derivative there equals $f'''(x)$ as well. How many 
hyperosculating fractional-linear functions are there for an arbitrary f? According 
to the Ghys theorem, at least four. In terms of the function f, these points are the 
roots of a rather intimidating expression 

$$\frac{f'''(x)}{f'(x)} - \frac{3}{2}\left(\frac{f''(x)}{f'(x)}\right)^2,$$

called the Schwarzian derivative of f. 

7 Every tennis ball has a clearly visible curve on its surface which has exactly four inflections. 

Another direction of generalizing the four vertex theorem is to replace a smooth 
curve by a polygon. This discretization may be performed in different ways, leading 
to different results. We mention only one, probably the oldest. This is the Cauchy 
lemma (1813) which plays the central role in Cauchy's celebrated proof of the 
rigidity of convex polyhedra, described in Lecture 24: given two convex polygons 
whose respective sides are congruent, the cyclic sequence of the differences of their 
respective angles changes sign at least four times. 




Figure 10.17. Positive and negative self-tangencies 



Finally back to vertices. It has been known for a long time that the four vertex 
theorem holds for non-convex simple closed curves as well. V. Arnold conjectured 
that one could extend it much further. Starting with an oval, one is allowed to 
deform the curve smoothly and even to intersect itself; the only prohibited event is 
when the curve touches itself so that the touching pieces have the same orientations 
as in Figure 10.17. According to Arnold's conjecture, the four vertex theorem holds 
for every curve that can be obtained from an oval as a result of such a deformation; 
see Figure 10.18 for a sample. 




Figure 10.18. An allowed deformation of an oval 

Recently Yu. Chekanov and P. Pushkar' proved this conjecture using ideas 
from contemporary symplectic topology and knot theory [15]. 

10.6 Exercises. 

10.1. (a) Draw involutes of a cubic parabola. 
(b) Draw involutes of the curve in Figure 10.19. 




Figure 10.19 



10.2. A cycloid is the curve traversed by a point of a circle that rolls without 
sliding along a horizontal line. Describe the evolute of a cycloid. 

10.3. Compute the curvature of a semi-cubic parabola at the cusp. 

10.4. (a) Express the perimeter length and the area of an oval in terms of its 
support function. 

(b) Parameterize an oval $\gamma$ by the angle $\varphi$ made by its tangent with a fixed 
direction, and let $p(\varphi)$ be the support function. Prove that 

$$\gamma(\varphi) = (p(\varphi)\sin\varphi + p'(\varphi)\cos\varphi,\ -p(\varphi)\cos\varphi + p'(\varphi)\sin\varphi).$$

(c) Show that the radius of curvature of $\gamma(\varphi)$ equals $p''(\varphi) + p(\varphi)$. 

10.5. Let f be a smooth function of one real variable. The osculating (Taylor) 
polynomial $g_t(x)$ of degree n of the function f(x) at point t is the polynomial whose 
value and the values of whose first n derivatives at point t coincide with those of f: 

$$g_t(x) = \sum_{i=0}^{n} \frac{f^{(i)}(t)}{i!}\,(x - t)^i.$$

Assume that n is even and $f^{(n+1)}(t) \neq 0$ on some interval I (possibly, infinite). 
Prove that for any distinct a and b from the interval I, the graphs of the osculating 
polynomials $g_a(x)$ and $g_b(x)$ are disjoint. 

Comment: this theorem strongly resembles the Tait-Kneser theorem 10.2. 

10.6. * Consider a trigonometric polynomial 

$$f(x) = a_k \cos kx + b_k \sin kx + a_{k+1}\cos(k+1)x + b_{k+1}\sin(k+1)x + \dots + a_n \cos nx + b_n \sin nx$$

where $k < n$. Prove that f has at least 2k roots on the circle $[0, 2\pi]$. 

Hint. Let I be the inverse derivative of a periodic function with the integration 
constant chosen so that the average value of the function is zero. Denote the 
number of sign changes of a function f by Z(f). Then Rolle's theorem states that 
$Z(f) \geq Z(I(f))$. Iterate this inequality many times and investigate how f changes 
under the action of I. 








LECTURE 11 

Segments of Equal Areas 

11.1 The problem. The main "message" of Lecture 9 was that cusps are 
ubiquitous: every generic 1-parameter family of curves has an envelope, and this 
envelope usually has cusps. This lecture is a case study: we investigate, in detail, 
one concrete family of lines in the plane. 

Let $\gamma$ be a closed convex plane curve. Fix a number $0 < t < 1$. Consider the 
family of oriented lines that divide the area inside $\gamma$ in the ratio $t : (1 - t)$, the t-th 
portion on the left, and the (1 - t)-th on the right of the line. This 1-parameter 
family of lines has an envelope, the curve $\Gamma_t$. 

Let us make an immediate observation: the curves $\Gamma_t$ and $\Gamma_{1-t}$ coincide. Thus 
we may restrict ourselves to $0 < t \leq 1/2$. The curves $\Gamma_t$ are our main objects of 
study. 



11.2 An example. Let us start with a simple example (which still might be 
familiar to some of the readers from high school 1 ). Consider the family of lines that 
cut off a fixed area A from a plane wedge, and let $\Gamma$ be the envelope of this family 
of lines. 

Theorem 11.1. The curve $\Gamma$ is a hyperbola. 

Proof. Apply an area preserving linear transformation that takes the given 
wedge to a right angle. We consider the sides of the angle as the coordinate axes. 
It suffices to prove the theorem in this case. 

Let f(x) be a differentiable function. The tangent line to the graph $y = f(x)$ 
at point $(a, f(a))$ has the equation $y = f'(a)(x - a) + f(a)$. The x- and y-intercepts 
of this line are 

$$a - \frac{f(a)}{f'(a)} \quad \text{and} \quad f(a) - a f'(a).$$

1 Admittedly, an over-optimistic statement. 

Consider the hyperbola $y = c/x$. In this case, the x- and y-intercepts of the 
tangent line at point $(a, c/a)$ are $2a$ and $2c/a$. Therefore the area of the triangle 
bounded by the coordinate axes and this tangent line is $2c$, a constant. 

This proves that the tangent lines to a hyperbola cut off triangles of constant 
area from the coordinate "cross". Formally, this is not yet what the theorem 
claims. To finish the proof, choose the constant c in the equation of the hyperbola 
$y = c/x$ so that the area in question equals A. Then this hyperbola is the curve $\Gamma$ 
from the formulation of the theorem, the envelope of the lines that cut off triangles 
of area A from the coordinate axes. □ 
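
The computation with intercepts is easily confirmed by a few lines of code. The following sketch is ours (the function names are hypothetical): it evaluates the intercept formulas for the tangent line to $y = c/x$ and returns the area of the triangle cut off from the first quadrant.

```python
def tangent_intercepts(f, df, a):
    """x- and y-intercepts of the tangent line to y = f(x) at x = a."""
    return a - f(a) / df(a), f(a) - a * df(a)

def cut_off_area(c, a):
    """Area of the triangle cut from the first quadrant by the tangent
    line to the hyperbola y = c/x at the point (a, c/a), with a > 0."""
    x0, y0 = tangent_intercepts(lambda x: c / x, lambda x: -c / x**2, a)
    return 0.5 * x0 * y0
```

For a fixed c the result should not depend on the point of tangency and should equal 2c, in agreement with the argument above.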

11.3 The envelope of segments of equal areas is the locus of their 
midpoints. The title of this section is a formulation of a theorem. The theorem 
asserts that the curve $\Gamma_t$ is tangent to the segments that cut off the t-th portion of the 
total area at their midpoints. 




Figure 11.1. Proving that the envelope of segments of equal areas 
is the locus of their midpoints 

Proof of Theorem. Consider Figure 11.1. Let AB and A'B' be two close seg- 
ments from our family and $\varepsilon$ the angle between them. Since both segments cut off 
equal areas from the curve $\gamma$, the areas of the sectors AOA' and BOB' are equal. 
The areas of these sectors are approximately equal to $(1/2)|AO|^2\varepsilon$ and $(1/2)|BO|^2\varepsilon$, 
with error of order $\varepsilon^2$. Therefore $|AO| - |BO|$ tends to zero as $\varepsilon \to 0$. □ 

As an application, let us solve the following problem: given two nested ovals, 
is there a chord of the outer one, tangent to the inner one and bisected by the 
tangency point? See Figure 11.2. Simple as it sounds, this problem is not easy to 
solve, unless one uses the above theorem. 

Solution to the problem. Let $\ell$ be the tangent segment to the inner curve that 
cuts off the smallest area from the outer one. Denote by S the value of this area. 
Consider two close segments $\ell'$ and $\ell''$ that cut off area S from the outer oval. Let 
A be the tangency point of $\ell$ with the inner oval, and B and C the intersections of 
$\ell$ with $\ell'$ and $\ell''$, see Figure 11.3. 

Figure 11.2. A problem on two ovals 




Figure 11.3. Solving the problem on two ovals 

Since S is the minimal area, segments $\ell'$ and $\ell''$ do not contain points inside 
the inner oval. Hence point A lies between points B and C. As $\ell'$ and $\ell''$ tend to 
$\ell$, points B and C tend to A. Thus A is the tangency point of segment $\ell$ with the 
envelope of the segments that cut off area S from the outer oval. By the above 
theorem, A bisects $\ell$. 

Of course, one can repeat the argument, replacing the minimal area by the 
maximal one. As a result, there are at least two chords, tangent to the inner oval 
at their midpoints. 

11.4 Digression: outer billiards. One is naturally led to the definition of 
an interesting dynamical system, called the outer (or dual) billiard. Unlike the 
usual billiards, discussed in Lecture 28, the game of outer billiard is played outside 
the billiard table. 

Let C be a plane oval. From a point x outside C, there exist two tangent lines 
to C. Choose the right one, as viewed from x, and reflect x in the tangency point. 
One obtains a new point, y, and the transformation that takes x to y is the outer 
billiard map, see Figure 11.4. 

There are many interesting things one can say about outer billiards, see [23, 
78, 83] for surveys. We will establish but a few fundamental properties of outer 
billiards. 
Figure 11.4. Outer billiard transformation 

Theorem 11.2. For every oval C , the outer billiard map is area preserving. 

Proof. Consider two close tangent lines to the curve C, pick points $x_1, x_2$ and 
$x'_1, x'_2$ on these lines, and let $y_1, y_2, y'_1, y'_2$ be their images under the outer billiard 
map, see Figure 11.5. The outer billiard map takes the quadrilateral $x_1 x_2 x'_2 x'_1$ to 
$y_1 y_2 y'_2 y'_1$. 




Figure 11.5. Area preserving property of outer billiards 

Denote by O the intersection point of the lines and let $\varepsilon$ be the angle between 
them. Arguing as in Section 11.3, the areas of the triangles $x_1 O x'_1$ and $y_1 O y'_1$ are 
equal, up to an error of order $\varepsilon^2$, and likewise for the triangles $x_2 O x'_2$ and $y_2 O y'_2$. 
Hence, up to the same error, the areas of the quadrilaterals $x_1 x_2 x'_2 x'_1$ and $y_1 y_2 y'_2 y'_1$ 
are equal. In the limit $\varepsilon \to 0$, we obtain the area preserving property. □ 
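
As a numerical sanity check of Theorem 11.2, the following sketch implements the outer billiard map for the unit circle and, by an affine change of variables, for an ellipse, and then estimates the Jacobian determinant of the map by central differences. The function names, the semi-axes a = 2, b = 1 and the sample points are illustrative assumptions, not taken from the text; the ellipse case relies on the fact that tangency and midpoints are preserved by affine maps.

```python
import math

def outer_billiard_circle(x, y):
    """Outer billiard map about the unit circle; (x, y) must lie outside it."""
    d = math.hypot(x, y)
    if d <= 1.0:
        raise ValueError("point must lie outside the unit circle")
    theta = math.atan2(y, x)
    phi = math.acos(1.0 / d)  # angle at the center between the point and a tangency point
    # Tangency point on the right-hand side when looking from the point toward the circle.
    tx, ty = math.cos(theta + phi), math.sin(theta + phi)
    # Reflect the point in the tangency point.
    return 2.0 * tx - x, 2.0 * ty - y

def outer_billiard_ellipse(x, y, a=2.0, b=1.0):
    """Outer billiard about (x/a)^2 + (y/b)^2 = 1, conjugating the circle map by diag(a, b)."""
    u, v = outer_billiard_circle(x / a, y / b)
    return a * u, b * v

def jacobian_det(f, x, y, h=1e-6):
    """Central-difference estimate of the Jacobian determinant of a plane map f."""
    fx1, fy1 = f(x + h, y)
    fx0, fy0 = f(x - h, y)
    gx1, gy1 = f(x, y + h)
    gx0, gy0 = f(x, y - h)
    a11, a21 = (fx1 - fx0) / (2 * h), (fy1 - fy0) / (2 * h)
    a12, a22 = (gx1 - gx0) / (2 * h), (gy1 - gy0) / (2 * h)
    return a11 * a22 - a12 * a21

if __name__ == "__main__":
    print(jacobian_det(outer_billiard_circle, 2.0, 0.5))
    print(jacobian_det(outer_billiard_ellipse, 3.0, 1.0))
```

Both printed values should be close to 1, in agreement with the area preserving property; this is only a spot check at two points, not a proof.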

Here is another question about outer billiards. Given a plane oval C, is there 
an n-gon, circumscribed about C, whose sides are bisected by the tangency points? 
Such a polygon corresponds to an n-periodic orbit of the outer billiard about C. 

The answer to this question is affirmative. Indeed, consider the circumscribed 
n-gon of the minimal area. Then, arguing as in the solution to the problem in 
Section 11.3, each side of this polygon is bisected by its tangency point with C. The 
same argument applies to star-shaped n-gons, see Figure 11.6 for three types of 
septagons.² 

11.5 What the envelope has and what it has not. The envelopes $\Gamma_t$ do 
not have double tangent lines and inflection points. Indeed, if a curve has a double 
tangent line or an inflection point then it has two close parallel tangents, see Figure 
11.7, and these parallel tangents cannot cut off the same areas from the curve $\gamma$. 

² In fact, for every $n \geq 3$ and every $1 \leq r \leq n/2$, coprime with n, there are at least two 
circumscribed n-gons, making r turns around the oval C, whose sides are bisected by the tangency 
points, see [78, 83]. 
Figure 11.6. Three types of circumscribed septagons 




Figure 11.7. The envelope does not have double tangents 



What the envelope of segments of equal areas may have are cusps. The following 
theorem tells us when $\Gamma_t$ has a cusp. We assume that $\gamma$ is an oval. 

Theorem 11.3. If the midpoint of a chord AB of the curve $\gamma$ is a cusp of the 
envelope of segments of equal areas then the tangent lines to $\gamma$ at points A and B 
are parallel. 

Proof. Let O be the midpoint of AB. Since O is a cusp of the envelope of 
segments of equal areas, the velocity of point O is zero, and the instantaneous 
motion of the line AB is revolution about point O. Since O is the midpoint of the 
segment AB, the velocity vectors of points A and B are symmetric with respect to 
point O, and hence the tangent lines to $\gamma$ at the points A and B are parallel. □ 

Suppose that the tangent lines to $\gamma$ at points A and B are parallel. To describe 
the behavior of the envelope of segments of equal areas $\Gamma_t$ we need additional 
information about the curvature of the curve $\gamma$ at points A and B. Assume that 
the curvature at B is greater. 

Theorem 11.4. The envelope $\Gamma_t$ has a cusp pointing toward B. 

Proof. Consider Figure 11.8, left: $\gamma_1$ and $\gamma_2$ are pieces of the curve $\gamma$ near points 
A and B; O is the midpoint of segment AB; and $\bar\gamma_1$ is symmetric to $\gamma_1$ with respect 
to O. 

Draw a chord C'D' through point O, close to AB. Since the curvature of $\gamma_2$ is 
greater than that of $\gamma_1$, the area of the sector AOC' is greater than that of BOD' 
(it is equal to the area of the sector BOM, symmetric to AOC'). Therefore the 
segment CD, that divides the area bounded by $\gamma$ in the same ratio as AB, lies to the right 
of C'D'. The midpoint of CD is close to point O', the intersection of the segments 
AB and CD. Similar observations can be made concerning the segments E'F' and 
EF. Thus, the envelope $\Gamma_t$, tangent to AB, CD, and EF and passing through O, 
must have a cusp at O, pointing toward point B (see Figure 11.8, right). □ 




Figure 11.8. Cusp of the envelope 

And what if the curvatures at points A and B are equal? This event will not 
happen for a generic curve $\gamma$. Indeed, we assume three conditions to hold: the 
segment AB divides the area in the ratio $t : (1 - t)$, the tangent lines at A and B 
are parallel, and the curvatures at A and B are equal. But a pair of points A and 
B on the curve $\gamma$ has only two degrees of freedom, so for three conditions to hold 
is too much to expect. 

However, if one allows the parameter t to vary, then one may encounter a chord 
AB of $\gamma$ with parallel tangent lines and equal curvatures at the endpoints. We will 
refer to this situation as the case of maximal degeneracy. In fact, cases of maximal 
degeneracy are bound to happen. 

Lemma 11.1. Any oval $\gamma$ has a chord with parallel tangent lines and equal 
curvatures at the endpoints; the number of such chords is odd. 

Proof. For every point A of $\gamma$ there exists a unique "antipodal" point B such 
that the tangents at A and B are parallel. Assume that the curvature at A is 
greater than that at B. Let us move point A continuously toward B; its antipodal 
point will move toward A. After the points A and B have switched, the curvature 
at the first point is smaller than that at the second. Therefore the curvatures at 
the two points were equal somewhere in-between. Furthermore, the total number 
of sign changes of the difference between the curvatures at points A and B is odd, 
as claimed. □ 

11.6 How many cusps are there? How many cusps does the envelope $\Gamma_t$ 
have? The answer depends on whether $t = 1/2$ or not. 

Theorem 11.5. The number of cusps of $\Gamma_{1/2}$ is odd and not less than 3.³ 

³ Compare with the Möbius theorem mentioned in Lecture 10. 
Proof. For every direction, there is a unique non-oriented line that bisects the 
area bounded by the curve $\gamma$. Hence, after traversing $\Gamma_{1/2}$, its tangent line makes 
a turn through 180°. How can this be? Each time one passes a cusp, the tangent 
direction switches to the opposite, see Figure 11.9. This means that the total 
number of cusps is odd. 




Figure 11.9. A cusp reverses the direction 

The number of cusps of $\Gamma_{1/2}$ is not one. Arguing by contradiction, assume there 
is a single cusp with a vertical tangent line. Then $\Gamma_{1/2}$ has no other vertical tangent 
lines. Left of the cusp, the smooth curve $\Gamma_{1/2}$ moves to the left, and right of the 
cusp, to the right. Such a curve cannot close up, see Figure 11.10; a contradiction. 
□ 




Figure 11.10. Proving that there are at least 3 cusps 

If $t \neq 1/2$ then, for every direction, there is a unique oriented line that divides 
the area bounded by the curve $\gamma$ in the ratio $t : (1 - t)$. Hence, after traversing $\Gamma_t$, 
its tangent line makes a turn through 360°. It follows that the number of cusps is 
even. 

11.7 All in one figure. Figure 11.11 depicts the family of envelopes $\Gamma_t$ as t 
varies from 0 to 1/2. The delta-shaped curve at the center is $\Gamma_{1/2}$. 

The main new observation is that the cusps of the curves $\Gamma_t$ lie on a new curve, 
$\Delta$ (shown as a dashed curve in Figure 11.11), the locus of midpoints of the chords 
with parallel tangents at the endpoints. 

The curve $\Delta$ also has cusps! These are the points where cusps of the envelopes 
$\Gamma_t$ appear or disappear in pairs, and these are precisely the points of maximal 
degeneracy, introduced in Section 11.5. 
Figure 11.11. The family of envelopes $\Gamma_t$ 



How many cusps does the curve $\Delta$ have? Asking this question, we assume, 
as we always do when dealing with cusps, that our curves are sufficiently generic: 
otherwise $\Delta$ might degenerate even to a single point; this is the case when the 
original oval is a circle or an ellipse. 

Theorem 11.6. The number of cusps of $\Delta$ is odd and not less than 3. 

Proof. That the number of cusps is odd follows from Lemma 11.1. We claim 
that the number of cusps of $\Delta$ is not less than that of $\Gamma_{1/2}$, that is, by Theorem 
11.5, not less than 3. 

Let k be the number of cusps of $\Gamma_{1/2}$. Then, for $\varepsilon$ small enough, the curve 
$\Gamma_{1/2-\varepsilon}$ has 2k cusps, see Figure 11.11. On the other hand, for $\varepsilon$ small enough, 
the curve $\Gamma_\varepsilon$ is smooth. Therefore, as t varies from $1/2 - \varepsilon$ to $\varepsilon$, all 2k cusps must 
pairwise vanish at points of maximal degeneracy, that is, cusps of $\Delta$. Thus $\Delta$ has 
at least k cusps, and since $k \geq 3$, the result follows. □ 

11.8 Polygons. Of course, a convex polygon is not an oval, but one can 
approximate it by a smooth strictly convex curve: almost flat arcs along sides and 
sharp turns at the vertices. 




Figure 11.12. For a triangle, the curve $\Delta$ is a homothetic triangle 
Let us start with a triangle. A pair of points A, B with parallel tangents is a 
vertex of the triangle and any point of the opposite side. The locus $\Delta$ of midpoints 
of such segments AB is a triangle, similar to the given one with coefficient -1/2, 
see Figure 11.12. The vertices of $\Delta$ lie at the midpoints of the sides of the original 
triangle, therefore all the envelopes $\Gamma_t$ have cusps (even for very small values of t). 
The curves $\Gamma_t$ are made of arcs of hyperbolas: this follows from Theorem 11.1. 
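
The description of the midpoint locus for a triangle can be checked with a few lines of coordinate arithmetic. The sketch below uses a hypothetical sample triangle and helper names that are not part of the text: it verifies that the midpoint of a segment joining a vertex to a point of the opposite side lies on the segment joining the midpoints of the two adjacent sides, and that the resulting medial triangle is the image of the original one under the homothety with coefficient -1/2 centered at the centroid.

```python
def midpoint(p, q):
    """Midpoint of the segment pq."""
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

def centroid(p, q, r):
    """Centroid of the triangle pqr."""
    return ((p[0] + q[0] + r[0]) / 3, (p[1] + q[1] + r[1]) / 3)

def homothety(center, k, p):
    """Image of p under the homothety with the given center and coefficient k."""
    return (center[0] + k * (p[0] - center[0]),
            center[1] + k * (p[1] - center[1]))

def locus_point(vertex, q, r, s):
    """Midpoint of the segment from `vertex` to the point (1-s)q + s*r of the opposite side."""
    b = ((1 - s) * q[0] + s * r[0], (1 - s) * q[1] + s * r[1])
    return midpoint(vertex, b)

if __name__ == "__main__":
    P, Q, R = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)  # sample triangle (an assumption)
    G = centroid(P, Q, R)
    # The vertex R is sent to the midpoint of the opposite side PQ.
    print(homothety(G, -0.5, R), midpoint(P, Q))
    # A point of the locus for vertex P lies between the midpoints of PQ and PR.
    print(locus_point(P, Q, R, 0.25))
```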



Figure 11.13. The envelopes T t for a regular pentagon 

The latter holds for every convex polygon: the envelopes $\Gamma_t$ are piecewise 
smooth curves, made of arcs of hyperbolas. Notice that at the conjunction points 
between different hyperbolas, the two hyperbolas have the same tangent line. The 
directions of the two hyperbolas may be either opposite (in which case the 
conjunction point looks like a cusp) or the same (in which case the curve looks smooth, 
although the two hyperbolas have, in general, different curvatures). 

Let us call a vertex A of a convex polygon opposite to a side a if the line 
through A, parallel to a, lies outside of the polygon. Every side is opposite to a 
unique vertex (we assume that the polygon has no parallel sides) , but a vertex may 
be opposite to a number of sides, or to none, for that matter. 

Similarly to triangles, a pair of points A, B with parallel tangents is a vertex 
A of the polygon and a point B of the opposite side; the locus of midpoints of such 
segments AB is a segment, parallel to the side at half the distance to A. The union 
of these segments is a (possibly self-intersecting) polygon $\Delta$, and this is the locus of 
cusps of all the envelopes $\Gamma_t$. The vertices of the polygon $\Delta$ are points of maximal 
degeneracy. See Figure 11.13 for the case of a regular pentagon. The curve $\Delta$ in 
this case is a star-like self-intersecting pentagon; it is shown dashed in Figure 11.13. 
The loci of other conjunctions of hyperbolas making the curves $\Gamma_t$ are also shown 
dashed. A magnified version of the central part of Figure 11.13 is shown in Figure 
11.14. 

Let us finish with a comment on the curious difference between odd- and 
even-gons. For an n-gon with odd n and close to a regular one, the envelope $\Gamma_{1/2}$ of 
lines that bisect the area has n cusps and resembles a regular n-pronged star. A 
regular n-gon with n even is centrally symmetric, and the curve $\Gamma_{1/2}$ degenerates 
to a point, the center of the polygon. After a small perturbation, one obtains 
an "honest" curve $\Gamma_{1/2}$ with fewer than n cusps (the number of cusps is odd by 
Theorem 11.5). For example, for a quadrilateral, this curve is a "triangle" made 
of three arcs of hyperbolas, making cusps at the vertices (see Figure 11.15 for the 
family of curves $\Gamma_t$ in the case of a generic quadrilateral). 

Figure 11.14. The central part of Figure 11.13 

Figure 11.15. The envelopes $\Gamma_t$ for a quadrilateral 

11.9 Exercises. 

11.1. Given an oval, show that there exists a line that bisects its area and its 
perimeter length. 

11.2. Consider two nested convex bodies with smooth boundaries (ovaloids) in 
space. Prove that there exist at least two planes, tangent to the inner body and 
such that the tangency point is the center of mass of the intersection of the plane 
with the outer body. 
Hint. Consider the plane that cuts off the maximal volume. 

11.3. Given an oval $\gamma$, prove that there exist at least 3 pairs of points on $\gamma$ at 
which the tangent lines are parallel and the curvatures are equal. 

Hint: Show that the number of such pairs is odd. Show that, in terms of the 
support function $p(\alpha)$, we are interested in those $\alpha$ for which 

$p(\alpha) + p''(\alpha) - p(\alpha + \pi) - p''(\alpha + \pi) = 0.$ 

Then argue as in Section 10.3. 

11.4. The center symmetry set of an oval is the envelope of the family of chords 
connecting pairs of points where the tangents to the oval are parallel, see [34]. 

(a) Prove that the center symmetry set of a generic oval has no inflections or 
double tangents but has an odd number of, and not fewer than three, cusps. 

(b) Consider a chord $A_1A_2$ connecting two points of an oval with parallel 
tangents. Let $k_1$ and $k_2$ be the curvatures of the oval at points $A_1$ and $A_2$. Prove that 
$A_1A_2$ is divided by the tangency point with the center symmetry set in the ratio 
$k_2 : k_1$. 

(c) Show that the cusps of the center symmetry set correspond to the case when 
$k_1 = k_2$. 

(d) If an oval has constant width then the center symmetry set coincides with 
its evolute. 

11.5. Prove that, for a quadrilateral which is not a parallelogram, all the curves 
$\Gamma_t$ have cusps. 

11.6. (a) How many lines can there pass through a given point that bisect the 
area of a given triangle? 

(b) Same question for a quadrilateral without parallel sides. 

11.7. Let P be a convex n-gon without parallel sides. Prove that if n is even 
then the correspondence side $\mapsto$ opposite vertex is not one-to-one. 




LECTURE 12 

On Plane Curves 

12.1 Double points, double tangents and inflections. The topic of this 
lecture is smooth plane curves, like the one in Figure 12.1. Points of self-intersection 
are called double points; the curve in Figure 12.1 has three. 




Figure 12.1. A plane curve 

A double tangent is a line that touches the curve at two different points. We 
distinguish between outer and inner double tangents: for the former, the two small 
pieces of the curve lie on one side of the tangent line, and for the latter, on opposite 
sides, see Figure 12.2. It may not be immediately clear, but the curve in Figure 
12.1 has 8 outer and 4 inner double tangents. 

We shall also be interested in inflection points. Give a curve an orientation. 
When moving along the curve, one is turning either left or right. The inflection 
points are the points where the direction of this rotation changes to the opposite. 
The "left" and "right" segments of the curve alternate, hence the total number 
of inflection points of a closed curve is even. The curve in Figure 12.1 has two 
inflections. 
Figure 12.2. Two kinds of double tangent lines 




Figure 12.3. Eliminating a triple point by a small perturbation 

We are interested in typical properties of curves which are not destroyed by 
small perturbations. For example, a curve may pass through the same point thrice, 
but this event is not typical: a small perturbation replaces the triple point by three 
double points, see Figure 12.3. Likewise, a double tangent may touch the curve a 
third time, but this is not typical either, see Figure 12.4. 



Figure 12.4. Eliminating a triple tangent line by a small perturbation 

There are many other non-typical events that we exclude, such as a double 
tangent line passing through a double point, or a self-tangency of the curve, etc. 
We always assume our curves to be generic. 

12.2 Drawing doodles: the Fabricius-Bjerre formula. Let $T_+$ and $T_-$ 
be the numbers of outer and inner double tangents of a smooth closed curve, I its 
(even) number of inflections and D the number of double points. These numbers 
are not independent: there is a universal relation between them described in the 
following theorem. 

Theorem 12.1. For every generic smooth closed curve, one has: 

(12.1) $T_+ - T_- - \frac{1}{2} I = D.$ 

For example, for the curve in Figure 12.5, $T_+ = 5$, $T_- = 2$, $I = 2$, $D = 2$. 

Figure 12.5. An example of the Fabricius-Bjerre formula: $T_+ = 5$, $T_- = 2$, $I = 2$, $D = 2$ 

Formula (12.1) was found by the Danish mathematician Fabricius-Bjerre in 
1962 [28]. Drawing doodles is a natural human activity, enjoyed by millions of 
children around the world, and this beautiful result could easily have been discovered 
much earlier! 

Proof. Orient the curve and consider its positive tangent half-line at point x. The 
number of intersection points N of this half-line with the curve depends on the 
point. As x traverses the curve, this number changes, and when x returns to the 
initial position, this number N assumes the original value. 

When does N change? When x passes a double point, N decreases by 1. Since 
each double point is visited twice, the total contribution of double points to the 
increment of N is $-2D$. When x passes an inflection point, N also decreases by 1, 
hence the total contribution of inflections is $-I$, see Figure 12.6. 



Figure 12.6. Two cases when N changes 

A double tangent contributes ±2, depending on whether it is outer or inner. 
More precisely, there are 6 cases, depending on the orientations, shown in Figure 
12.7. Their total contribution to the increment of N is $2T'_+ + 4T''_+ - 2T'_- - 4T''_-$. 

Thus 

(12.2) $2T'_+ + 4T''_+ - 2T'_- - 4T''_- - 2D - I = 0.$ 

Now change the orientation of the curve. The numbers $T''_\pm$ and $T'''_\pm$ will interchange 
and the other numbers involved in formula (12.2) will remain the same. Therefore 

(12.3) $2T'_+ + 4T'''_+ - 2T'_- - 4T'''_- - 2D - I = 0.$ 

It remains to add (12.2) and (12.3) and to divide by 4: 

$T'_+ + T''_+ + T'''_+ - T'_- - T''_- - T'''_- - D - \frac{1}{2} I = 0,$ 

which is the same as (12.1), since $T_\pm = T'_\pm + T''_\pm + T'''_\pm$. □ 




Figure 12.7. Bookkeeping of double tangent lines 

Relation (12.1) is a necessary condition on $T_\pm$, I, D to be the numbers of outer 
and inner double tangents, inflections and double points of a closed plane curve. Is it also 
sufficient? See Exercise 12.5 for a partial answer. 
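
Relation (12.1) is easy to test on the counts quoted in this lecture. The following small checker is a sketch (the function name is an assumption made here); it confirms the relation for the curve of Figure 12.5 and for the counts stated for Figure 12.1 (8 outer and 4 inner double tangents, two inflections, three double points).

```python
from fractions import Fraction

def fabricius_bjerre_holds(t_plus, t_minus, inflections, double_points):
    """Check T_+ - T_- - I/2 == D using exact rational arithmetic."""
    return t_plus - t_minus - Fraction(inflections, 2) == double_points

if __name__ == "__main__":
    print(fabricius_bjerre_holds(5, 2, 2, 2))  # counts quoted for Figure 12.5
    print(fabricius_bjerre_holds(8, 4, 2, 3))  # counts quoted for Figure 12.1
```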

A generalization of the Fabricius-Bjerre formula (12.1), due to Weiner [89], 
concerns smooth closed curves on the sphere. Of course, in this case, "lines" are 
understood as great circles. There is one more ingredient that goes into the formula, 
the number A of pairs of antipodal points of the curve. Weiner's formula states: 

(12.4) $T_+ - T_- - \frac{1}{2} I = D - A.$ 

If the curve lies in a hemisphere, it has no antipodal points. One may centrally 
project the hemisphere, along with the curve, to the plane, and then (12.4) will 
coincide with (12.1). 

Remark 12.1. To the reader familiar with algebraic geometry, formula (12.1) 
resembles the Plücker formulas. These formulas concern algebraic curves in the 
projective plane; everything is considered with complex coefficients. As before, let 
T, D, I and C denote the numbers of double tangents, double points, inflections and 
cusps of the curve (no signs are involved when working with complex numbers). 

Two other numbers contribute to the Plücker formulas: the number of 
intersections of the curve with a general line, N (the degree of the curve) and the number 
of tangent lines to the curve from a generic point, $N^*$ (the class of the curve). The 
number N is the degree of the polynomial equation that defines the curve, and $N^*$ 
is the degree of the polynomial equation that defines the projectively dual curve. 
For example, for an ellipse, $N = N^* = 2$. 

The Plücker formulas are: 

$N^* = N(N - 1) - 2D - 3C, \qquad N = N^*(N^* - 1) - 2T - 3I,$ 

and 

$3N(N - 2) = I + 6D + 8C, \qquad 3N^*(N^* - 2) = C + 6T + 8I.$ 

The formulas in each pair are interchanged by the projective duality, described in 
Lecture 8. 
For example, for a smooth curve of degree N = 4, one has C = D = 0, and 
therefore $N^* = 12$, $I = 24$ and $T = 28$; this will be of critical importance in Lecture 
17. 
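
The arithmetic in this example can be reproduced mechanically. The sketch below (the function name and the dictionary keys are assumptions of this illustration) solves the Plücker formulas for the class, the number of inflections and the number of double tangents, given the degree, the number of double points and the number of cusps.

```python
def plucker(degree, double_points=0, cusps=0):
    """Solve the Plucker formulas for N*, I and T given N, D and C."""
    n, d, c = degree, double_points, cusps
    n_star = n * (n - 1) - 2 * d - 3 * c               # N* = N(N-1) - 2D - 3C
    inflections = 3 * n * (n - 2) - 6 * d - 8 * c      # 3N(N-2) = I + 6D + 8C
    six_t = 3 * n_star * (n_star - 2) - c - 8 * inflections  # 3N*(N*-2) = C + 6T + 8I
    if six_t % 6 != 0:
        raise ValueError("inconsistent input: 6T is not divisible by 6")
    t = six_t // 6
    # Consistency check with the remaining formula N = N*(N*-1) - 2T - 3I.
    if n != n_star * (n_star - 1) - 2 * t - 3 * inflections:
        raise ValueError("inconsistent input: dual degree formula fails")
    return {"class": n_star, "inflections": inflections, "double_tangents": t}

if __name__ == "__main__":
    print(plucker(4))  # smooth quartic
    print(plucker(3))  # smooth cubic
```

For the smooth quartic this returns class 12, 24 inflections and 28 double tangents, matching the values quoted above.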

Note the difference between algebraic curves and smooth curves, the "doodles" 
of this lecture: the former are "rigid" objects, depending on a finite number of 
parameters, the coefficients of their polynomial equations, whereas the latter are 
extremely "soft" and can be deformed with much greater freedom. One 
manifestation of this flexibility will be discussed in Section 12.4. 

12.3 Doodles with cusps: Ferrand's formula. Another, quite recent, 
formula for curves with cusps is due to E. Ferrand [29]. Consider a plane curve with 
an even number of cusps and color the smooth arcs between the cusps alternately 
red and blue. We assign signs to double points: a double point is positive if it is the 
intersection of two arcs of the same color, and it is negative if it is the intersection of 
two arcs of opposite colors. Denote by $D_\pm$ the numbers of positive and negative 
double points. 

We also redefine the signs of double tangents. There are three attributes to a 
double tangent: whether the orientations of the two arcs are the same or opposite; 
whether the arcs lie on the same or opposite sides of the tangent line; and whether 
the two arcs are of the same or opposite colors. The sign of a double tangent is 
shown in Figure 12.8. 
Figure 12.8. The signs of double tangents 



Let $T_\pm$ be the numbers of positive and negative double tangent lines. With this 
preparation, Ferrand's generalization of the Fabricius-Bjerre formula to curves with 
cusps is as follows. 

Theorem 12.2. For every generic plane curve with cusps, one has: 

(12.5) $T_+ - T_- - \frac{1}{2} I = D_+ - D_- - \frac{1}{2} C.$ 

We shall not prove Ferrand's theorem: we do not know a proof as simple as 
the one given to the Fabricius-Bjerre theorem (but the reader is welcome to try to 
find such a proof, see Exercise 12.8). Formula (12.5) is illustrated in Figure 12.9; 
the first curve has 

$T_+ = 4$, $T_- = 3$, $I = 6$, $D_+ = 0$, $D_- = 1$, $C = 2$, 

and the second 

$T_+ = 2$, $T_- = 0$, $I = 4$, $D_+ = 1$, $D_- = 0$, $C = 2$. 

Figure 12.9. An illustration of Ferrand's formula 
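
The two sets of counts can be checked against formula (12.5) directly. The checker below is a sketch with an assumed function name; it only verifies the arithmetic of the relation and says nothing about how the signs are read off a drawing.

```python
from fractions import Fraction

def ferrand_holds(t_plus, t_minus, inflections, d_plus, d_minus, cusps):
    """Check T_+ - T_- - I/2 == D_+ - D_- - C/2 using exact rational arithmetic."""
    left = t_plus - t_minus - Fraction(inflections, 2)
    right = d_plus - d_minus - Fraction(cusps, 2)
    return left == right

if __name__ == "__main__":
    print(ferrand_holds(4, 3, 6, 0, 1, 2))  # first curve of Figure 12.9
    print(ferrand_holds(2, 0, 4, 1, 0, 2))  # second curve of Figure 12.9
```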

Remark 12.2. Projective duality interchanges the numbers involved in the 
Fabricius-Bjerre and Ferrand formulas: the numbers of double tangents and of 
inflections of a curve equal the numbers of double points and of cusps of the dual 
curve, see Figures 8.7 and 8.8. 

12.4 Winding number and Whitney's theorem. The winding number of 
a closed smooth curve is the total number of turns made by the tangent vector as 
one traverses the curve. If the curve is oriented, the winding number has a sign, 
otherwise it is a non-negative integer. For example, the winding numbers of the 
curves in Figure 12.10 are equal to 1 and 3, respectively. 




Figure 12.10. Winding numbers 1 and 3 

Let us continuously deform a smooth curve. We do not exclude self-tangency or 
multiple self-intersections, as in Figure 12.11. In such a deformation,¹ the winding 
number remains the same. Indeed, a small perturbation of a curve leads to a small 
change in the winding number; being an integer, it must remain constant. 
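
The definition of the winding number translates into a short numerical procedure: sample the curve densely, and add up the turning angles between consecutive edges of the resulting closed polygon. The sketch below is one possible implementation under stated assumptions: the curve is given as a regular 2π-periodic parameterization, the sampling is fine enough that each turning angle is small, and the three test curves (a circle, a limaçon with an inner loop, and a figure eight) are examples chosen here, not the curves of Figure 12.10.

```python
import math

def winding_number(curve, samples=4000):
    """Total number of turns of the tangent direction of a closed curve t -> (x, y)."""
    pts = [curve(2 * math.pi * k / samples) for k in range(samples)]
    # Edge vectors of the inscribed closed polygon.
    edges = [(pts[(k + 1) % samples][0] - pts[k][0],
              pts[(k + 1) % samples][1] - pts[k][1]) for k in range(samples)]
    total = 0.0
    for k in range(samples):
        ax, ay = edges[k]
        bx, by = edges[(k + 1) % samples]
        # Signed angle from one edge to the next.
        total += math.atan2(ax * by - ay * bx, ax * bx + ay * by)
    return round(total / (2 * math.pi))

def circle(t):
    return (math.cos(t), math.sin(t))

def limacon(t):
    r = 1 + 2 * math.cos(t)  # limacon with an inner loop
    return (r * math.cos(t), r * math.sin(t))

def figure_eight(t):
    return (math.sin(t), math.sin(2 * t))

if __name__ == "__main__":
    print(winding_number(circle), winding_number(limacon), winding_number(figure_eight))
```

Rounding to the nearest integer reflects the argument in the text: the sum is an integer multiple of 2π up to a small discretization error.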

The converse statement is Whitney's theorem. 

Theorem 12.3. If two closed smooth curves have the same winding numbers 
then one can be continuously deformed to the other. 

For example, the left curve in Figure 12.10 can be deformed to a circle, as the 
reader has probably already noticed. 



¹ The technical term for such a deformation is regular homotopy. 
Figure 12.11. A continuous deformation of a smooth curve 




Figure 12.12. A long curve 



We shall prove Whitney's theorem for another class of curves, the long curves. 
A long curve is a smooth plane curve that, outside some disc, coincides with the 
horizontal axis, see Figure 12.12. Long curves are a little easier to work with, 
whence our choice. 




Figure 12.13. Model long curves 

Proof of Whitney's theorem for long curves. Long curves are oriented from left 
to right. A model long curve with winding number n is the horizontal line with 
$|n|$ consecutive kinks, clockwise or counterclockwise, depending on the sign of n, see 
Figure 12.13. We want to prove that a long curve with winding number n can be 
deformed to one of these model curves. 

Figure 12.14 features a deformation that adds (or cancels) a pair of kinks with 
opposite orientations. Using this trick, one can always add $|n|$ such pairs to a given 
curve so that one obtains a curve with winding number zero, followed by n kinks. 
Therefore, it suffices to prove that a long curve $\gamma$ with zero winding number can be 
deformed to the horizontal axis. 




Figure 12.14. Adding or canceling a pair of opposite kinks 

Let $\gamma(t)$ be a parameterization of our curve. Consider the angle $\alpha(t)$ made by 
the positive tangent vector to $\gamma(t)$ with the horizontal direction. The graph of the 
function $\alpha(t)$ may look like Figure 12.15. 

In fact, the angle $\alpha(t)$ is defined only up to addition of a multiple of $2\pi$. We 
choose $\alpha(t) = 0$ on the left horizontal part of the curve and extend it continuously 
to an "honest" function of t. Since the winding number is zero, $\alpha(t) = 0$ on the 
right horizontal part of the curve as well. 



Figure 12.15. A graph of the function $\alpha(t)$ 

Let us squeeze this graph toward the horizontal axis: $\alpha_s(t) = s\alpha(t)$ where s 
varies from 1 to 0. For every value of s, one has a unique curve $\gamma_s$ whose direction 
at point $\gamma_s(t)$ is $\alpha_s(t)$. In particular, $\alpha_0(t) = 0$, hence $\gamma_0$ is the horizontal axis. 

What do the curves $\gamma_s$ look like? They start as the horizontal axis and end as 
horizontal lines, since $\alpha(t) = 0$ for sufficiently large $|t|$. The only problem is that 
the right end of $\gamma_s$ may be at a different height, see Figure 12.16. This problem 
is addressed by smoothly adjusting the curve on its horizontal right part, see the 
same figure, and one obtains a long curve $\gamma_s$. 

To wit: we have constructed a continuous family of long curves, from $\gamma = \gamma_1$ 
to $\gamma_0$, the horizontal axis. This is the desired deformation. □ 
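
The squeezing construction can be imitated numerically. The sketch below integrates the unit tangent vector with direction s·α(t) to produce points of the curve for a given s; the particular angle function, the parameter interval and the step count are hypothetical choices made for this illustration, and the final height adjustment of the right end described in the proof is not implemented.

```python
import math

def alpha(t):
    """A hypothetical angle function with zero winding number: a bump vanishing for |t| >= 1."""
    if abs(t) >= 1.0:
        return 0.0
    return 2.0 * (1.0 + math.cos(math.pi * t))

def curve_from_angle(angle, s, t_min=-6.0, t_max=6.0, steps=1200):
    """Points of the curve whose tangent direction at parameter t is s * angle(t)."""
    dt = (t_max - t_min) / steps
    x, y = t_min, 0.0  # start on the horizontal axis
    pts = [(x, y)]
    for k in range(steps):
        a = s * angle(t_min + (k + 0.5) * dt)  # midpoint rule for the integration
        x += math.cos(a) * dt
        y += math.sin(a) * dt
        pts.append((x, y))
    return pts

if __name__ == "__main__":
    for s in (1.0, 0.5, 0.0):
        end = curve_from_angle(alpha, s)[-1]
        print(s, end)  # the right end is horizontal but possibly at a nonzero height
```

For s = 0 the output is the horizontal axis itself; for other values of s the right end is horizontal, but its height generally differs from zero, which is exactly the problem the adjustment in Figure 12.16 addresses.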

Figure 12.16. Adjusting the height of the right end 

Let us mention a version of Whitney's theorem for curves on the sphere. The 
result is even simpler than in the plane. A generic smooth closed spherical curve 
has a single invariant that assumes values 0 or 1: this is the parity of the number 
of double points. 

Theorem 12.4. Two generic smooth spherical curves can be continuously 
deformed to each other if and only if their numbers of double points are either both 
even or both odd. 

Sketch of Proof. That the parity of the number of double points does not change 
under a generic deformation is clear from Figure 12.11. 

Let us show that two curves with the same parity of the number of double points 
can be deformed to each other. The sphere becomes the plane after deletion of a 
point. One obtains a plane curve that, by the Whitney theorem, can be deformed 
to (the closure of) a model curve in Figure 12.13. For these curves, the number of 
double points is one less than the winding number. It remains to show that, on the 
sphere, the model curves with winding numbers that differ by 2 can be deformed 
to each other. Such a deformation (for the winding numbers 0 and 2) is shown in 
Figure 12.17. □ 

Remark 12.3. Whitney's theorem has far-reaching generalizations in which 
the circle and the plane are replaced by arbitrary smooth manifolds. This area is 
known as the Smale-Hirsch theory. One of the most striking results of this theory is 
the sphere eversion, a deformation of the sphere in 3-dimensional space in the class 
of smooth, but possibly self-intersecting, surfaces that ends with the same sphere, 
turned inside out. A number of explicit constructions of such sphere eversions are 
known; one, due to W. Thurston, is shown in the movie "Outside in" [94] which 
we recommend to the reader.² 

² A more recent movie, "The Optiverse", features a different sphere eversion based on 
minimizing an elastic bending energy for surfaces in space. 

Figure 12.17. A deformation of a spherical curve 

12.5 Combinatorial formulas for the winding number. To find the 
winding number of a closed or long curve, one may just traverse the curve and count 
the total number of turns made. However, there are better ways of counting spins 
without getting dizzy, and we shall discuss a few in this section. 

The first formula is shown in Figure 12.18. This formula is, more or less, 
obvious. To count the number of total turns, it suffices to count how many times the 
tangent line to the curve is horizontal. There are four possibilities shown in Figure 
12.18. The first two contribute to rotation in the positive, counterclockwise 
direction, and the second two, to rotation in the negative direction. This proves the 
formula. 

Figure 12.18. A formula for the winding number 

Another formula for the winding number was given by Whitney in the same 
paper where he proved the theorem discussed in the preceding section. Let us first 
describe this formula for long curves. Traverse a curve from left to right. Each 
double point is visited twice and looks as shown in Figure 12.19. Call the first a 
positive and the second a negative double point. Let $D_\pm$ be the numbers of positive 
and negative double points of the curve. The formula for the winding number is: 

(12.6) $w = D_+ - D_-.$ 




Figure 12.19. The signs of double points 

Proof of formula (12.6). If the curve is a model one, as in Figure 12.13, the result 
clearly holds. Since every curve can be deformed to a model one, we shall be done 
if we show that $D_+ - D_-$ does not change under deformations. 

During a generic deformation, one may encounter two "singular" events 
depicted in Figure 12.11. The first introduces (or eliminates) a pair of double points 
of the opposite signs, and hence does not affect $D_+ - D_-$, while the second does not 
change the number or the signs of the double points involved. This proves (12.6). 
□ 




Figure 12.20. Rotation numbers 



For a closed oriented curve $\gamma$, formula (12.6) is modified as follows. First of all, 
to assign signs to double points, one chooses a starting point x on $\gamma$. 

Let y be a point not on the curve $\gamma$. Denote by r(y) the rotation number of 
the curve about y, that is, the number of times $\gamma$ goes about y (cf. Section 6.4). In 
other words, r(y) is the number of complete turns made by the position vector yx 
as the point x traverses the curve. See for example Figure 12.20, where the rotation 
numbers are assigned to the components of the complement of the curve. 
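
The rotation number about a point can be computed in the same spirit as the winding number: sample the curve and add up the signed angles swept by the position vector. The sketch below is an illustration under assumptions made here: the curve is a 2π-periodic parameterization, the point does not lie on the curve, and the limaçon used as a test curve is an example not taken from Figure 12.20.

```python
import math

def rotation_number(curve, y, samples=4000):
    """Number of complete turns of the vector from y to curve(t) as t runs over [0, 2*pi]."""
    pts = [curve(2 * math.pi * k / samples) for k in range(samples)]
    total = 0.0
    for k in range(samples):
        ax, ay = pts[k][0] - y[0], pts[k][1] - y[1]
        bx = pts[(k + 1) % samples][0] - y[0]
        by = pts[(k + 1) % samples][1] - y[1]
        # Signed angle swept by the position vector between consecutive samples.
        total += math.atan2(ax * by - ay * bx, ax * bx + ay * by)
    return round(total / (2 * math.pi))

def limacon(t):
    r = 1 + 2 * math.cos(t)  # limacon with an inner loop
    return (r * math.cos(t), r * math.sin(t))

if __name__ == "__main__":
    for point in [(0.5, 0.0), (2.0, 0.0), (5.0, 0.0)]:
        print(point, rotation_number(limacon, point))
```

The three sample points lie inside the inner loop, between the two loops, and outside the curve, so the printed rotation numbers drop by one each time the point crosses the curve.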

When the point y crosses the curve, r(y) changes by 1, as in Figure 12.21. If y 
is a point on the curve (but not a double point), the rotation number r(y) is defined 
as the half-integer, equal to the average of the two values obtained by pushing y 
slightly on both sides of the curve. 



Figure 12.21. How the rotation number changes upon crossing the curve 



The formula for the winding number of a closed curve is: 

(12.7) w = D + -D_ + 2r(x). 

Still another way to find the winding number w of an oriented curve is to 
resolve each double point as shown in Figure 12.22. After this is done everywhere, 
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our curve decomposes into a collection of simple curves, some oriented clock- wise 
and some counter clock- wise. Let i_ and I + be the numbers of these simple curves. 




Figure 12.22. The resolution of a double point

Theorem 12.5. One has: $w = I_+ - I_-$, see Figure 12.23.




Figure 12.23. Computation of the winding number by resolving a curve 

Proof. Traverse a curve $\gamma$, starting at a double point, say, x. Upon the first
return to x, one traverses a closed curve (with corner) $\gamma_1$; let $\alpha_1$ be the total turn
of its tangent vector. Likewise, continuing along $\gamma$ until the second return to x, one
traverses another closed curve, $\gamma_2$; let $\alpha_2$ be the total turn of its tangent vector.
Clearly the total turn of the tangent vector of $\gamma$ is $\alpha_1 + \alpha_2$. Resolving the double
point x, as in Figure 12.22, we make both curves smooth, adding the same amount
(say, $\pi/2$) to $\alpha_1$ and subtracting it from $\alpha_2$. Thus the winding number of $\gamma$ is the sum
of the winding numbers of the rounded curves $\gamma_1$ and $\gamma_2$. Applying this argument
to every double point yields the result. □
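The winding number can also be checked numerically as the total turn of the tangent vector. The sketch below is our own illustration under simple assumptions: the curve is sampled as a fine closed polygon, and the turn is accumulated between consecutive edge directions. The three sample curves and all function names are hypothetical choices, not taken from the text.

```python
import math

def winding_number(points):
    """Total turn of the edge directions of a closed polygon, in full turns."""
    n = len(points)
    total = 0.0
    for i in range(n):
        p, q, r = points[i], points[(i + 1) % n], points[(i + 2) % n]
        ux, uy = q[0] - p[0], q[1] - p[1]
        vx, vy = r[0] - q[0], r[1] - q[1]
        # Signed angle between two consecutive edges.
        total += math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
    return round(total / (2 * math.pi))

def sample(curve, samples=2000):
    """Sample a 2*pi-periodic parametrized curve at equally spaced parameters."""
    return [curve(2 * math.pi * k / samples) for k in range(samples)]

def circle(t):
    return (math.cos(t), math.sin(t))

def limacon(t):
    r = 1 + 2 * math.cos(t)          # one double point, inner loop
    return (r * math.cos(t), r * math.sin(t))

def figure_eight(t):
    return (math.sin(t), math.sin(t) * math.cos(t))

for name, curve in [("circle", circle), ("limacon", limacon), ("eight", figure_eight)]:
    print(name, winding_number(sample(curve)))
```

This agrees with Theorem 12.5: resolving the double point of the limaçon leaves two counterclockwise simple curves, while resolving the double point of the figure eight leaves one clockwise and one counterclockwise curve.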

Remark 12.4. The subject of this lecture is closely related to knot theory
(see [1, 72] for expositions). The Fabricius-Bjerre (12.1) and Ferrand (12.5) formulas
have natural interpretations as self-linking numbers, and this stimulated recent
interest in them. The combinatorial formulas for the winding number in Section
12.5 resemble some formulas for finite order knot invariants in contemporary knot
theory.

12.6 Exercises.

12.1. Prove that the complement to a closed plane curve has a chess-board
coloring (so that the adjacent domains have different colors).

12.2. Prove that the number of intersection points of two closed curves is even. 

12.3. Consider two plane curves (as usual, in general position). Let $t_+$ and $t_-$
be the number of their outer and inner common tangent lines and d the number of
their intersection points (thus we are not concerned with double tangents or double
points of either curve). Show that $t_+ = t_- + d$.

12.4. Draw curves with

(a) $T_+ = 2$, $T_- = 0$, $I = 2$, $D = 1$;

(b) $T_+ = 3$, $T_- = 0$, $I = 2$, $D = 2$;

(c) $T_+ = 4$, $T_- = 2$, $I = 0$, $D = 2$.

12.5. * (a) If $I$ is a positive even number and $T_+ - T_- - I/2 = D$ then there
exists a curve with the respective number of double tangents, inflections and double
points.

(b) If $I = 0$, prove that $T_-$ is even and $T_- \le (2D + 1)(D - 1)$.

(c) If $T_-$ is even and $T_- \le D(D - 1)$ then there exists a curve without inflections
and with the respective number of double tangents and double points.

Figure 12.24. A curve with cusps

12.6. Consider a curve with cusps, such as in Figure 12.24, and extend the 
notion of double tangents to include the lines that touch the curve in cusps, see 
Figure 12.25. Let C be the number of cusps. Prove that 

$T_+ - T_- - \frac{1}{2} I = D + C$.

Hint. Round up cusps, trading each for two inflections, see Figure 12.26. 




Figure 12.25. Generalized double tangent lines 




Figure 12.26. Rounding up a cusp

12.7. * Prove the Weiner formula (12.4). 

12.8. * Prove the Ferrand formula (12.5). 

12.9. Prove formula (12.7). 

Hint. Check that the right hand side of (12.7) does not depend on the choice
of x, see Figure 12.27.

Figure 12.27. The effect of changing the base point



12.10. Prove that the rotation number r(y) of the curve about a point y can 
be computed as follows: resolving all double points as in Figure 12.22, y gets 
surrounded by a number of clock-wise and counter clock-wise oriented curves. Then 
r(y) is the number of the latter minus the number of the former. 

12.11. Show that the winding number of a closed curve is at most one greater
than the number of its double points and has parity opposite to it.

12.12. Assume that a closed curve has n double points, labeled 1 through n. 
Traverse the curve and write down the labels of the double points in the order 
they are encountered. One obtains a cyclic sequence in which each number 1, . . . ,n 
appears twice. Prove that, for any i, between two occurrences of symbol i, there is 
an even number of symbols in this sequence (theorem of Gauss). 

Hint. Resolve the i-th double point, as in Figure 12.22, and use the fact that 
the resulting two curves intersect an even number of times. 
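The parity condition of Exercise 12.12 is easy to test on a given sequence of labels. The following sketch is only an illustration; the function name and the sample words are our own. It checks the necessary condition stated in the exercise, nothing more.

```python
def gauss_parity_holds(word):
    """Check that between the two occurrences of every label there is an even
    number of symbols. `word` is read cyclically; each label must occur twice."""
    positions = {}
    for index, label in enumerate(word):
        positions.setdefault(label, []).append(index)
    for label, occurrences in positions.items():
        if len(occurrences) != 2:
            raise ValueError(f"label {label!r} must occur exactly twice")
        between = occurrences[1] - occurrences[0] - 1
        # Going the other way around the cycle gives len(word) - 2 - between
        # symbols, which has the same parity because len(word) is even.
        if between % 2 != 0:
            return False
    return True

print(gauss_parity_holds([1, 2, 3, 1, 2, 3]))  # word of the standard trefoil curve
print(gauss_parity_holds([1, 1]))              # word of the figure eight
print(gauss_parity_holds([1, 2, 1, 2]))        # violates the condition
```

By the exercise, a word that fails this test, such as 1 2 1 2, cannot come from a closed plane curve.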



12.13. The left Figure 12.28 shows a disc embedded in the plane, whereas the 
right one is an immersed disc which overlaps itself. Such an immersion is a smooth 
map of a disc in the plane that is locally an embedding. The boundary of an 
embedded disc is a simple closed curve; the boundary of an immersed disc may be 
much more complex. 

(a) Does the curve in Figure 12.29 bound an immersed disc? 

(b) Prove that the boundary of an immersed disc has winding number 1.

(c) Show that the curve in Figure 12.30 bounds two different immersed discs. 




Figure 12.28. Embedded and immersed discs

Figure 12.29. Does this curve bound an immersed disc?




Figure 12.30. This curve bounds two different immersed discs

LECTURE 13

Paper Sheet Geometry 

13.1 Developable surfaces: surfaces made of a sheet of paper. Take a
sheet of paper and bend it without folding. You will have in your hands a piece of
surface whose shape will depend on how you bend. Samples of surfaces which you
can get are shown in Figure 13.1.




Figure 13.1. Paper sheet surfaces 



However, not every surface can be obtained by bending a sheet of paper. Ev- 
erybody knows, for example, that it is impossible to make even a small piece of 
a sphere out of a sheet of paper: if you press a piece of paper to a globe, some 
folds will appear on your sheet. It is possible to make a cylinder or a cone, but you 
cannot bend a sheet of paper like a handkerchief without making fold lines (Figure 
13.2). 

Figure 13.2. Cylinder and cone, but not a handkerchief

In geometry, surfaces which can be made out of a sheet of paper in the way
described above are called developable. We shall not even try to make this definition
more rigorous, but still we shall specify the two physical properties of paper which
are essential for our geometric purposes: paper is not compressible or stretchable
and is absolutely elastic. The first means that, after bending, all curves drawn on
the paper retain their lengths. The second means that there are no other restrictions
on bending paper. "Without folding" means that the surface remains smooth, which
means, in turn, that the surface has a tangent plane at every point.

We shall see that not all surfaces are developable from the simplest property of 
developable surfaces (this property, as well as all other major results of the theory 
of developable surfaces, was proved by Euler). 

13.2 Every developable surface is ruled. The latter means that for every
point of a developable surface there exists a straight interval which is contained
in the surface and contains the point in its interior. In terms of our everyday
experience, we can say that at every point A of the bent sheet of paper, we can
attach a bicycle spoke to the paper in such a way that some piece of it on both sides
of the point A will touch the paper (Figure 13.3). We shall not prove this fact
(the proofs known to us operate with formulas rather than geometric images) and
shall regard it as an experimental, but firmly established, property of developable
surfaces.




Figure 13.3. A straight ruling

If some point of a developable surface belongs to two different straight intervals,
then some piece of the surface around this point is planar (Figure 13.4). To
avoid this possibility we shall simply restrict our attention to surfaces which have
no planar pieces. This assumption implies that for every point of the surface there
is a unique line belonging to the surface and passing through this point.



We must add that no real life sheet of paper is infinite. So, our surfaces will 
have boundaries. Every point of the surface belongs to a unique straight interval 
which starts and ends on the boundary. These straight intervals form a continuous 
family which sweeps the whole surface (Figure 13.5). 



Figure 13.4. A planar point of a developable surface

Figure 13.5. A family of rulings

13.3 Not only a spoke, but also a ruler. There are too many ruled surfaces:
a moving straight line in space sweeps one. Some ruled surfaces are well
known: we shall discuss, in detail, the properties of two of them in Lecture 16:
a one-sheeted hyperboloid and a hyperbolic paraboloid. Now we can state that
developable surfaces are much rarer than ruled surfaces; in particular, the doubly
ruled surfaces of Lecture 16 are not developable (which is seen already from the
properties of developable surfaces listed in Section 13.2). We are going to formulate
another experimental fact which characterizes the difference between ruled surfaces
and developable surfaces.




Figure 13.6. The tangent planes along a ruling revolve 

Take an arbitrary ruled surface S, take a line $\ell$ on S, and consider the tangent
plane $T_A$ to S at some point A of $\ell$. This plane will contain $\ell$, but the planes
$T_A$ will be, in general, different for different points A of $\ell$; that is, when we move
point A along $\ell$, the plane $T_A$ will rotate about $\ell$. For example, if S is a one-sheeted
hyperboloid (see Figure 13.6), then $T_A$ will contain, besides $\ell$, the line of the second
family of lines (see Lecture 16), and hence these planes will be different for different
points A; when A moves along $\ell$, $T_A$ makes almost half a turn about $\ell$. Such things,
however, never happen on a developable surface:

All planes tangent to a developable surface S at points of a straight line
on this surface coincide. In other words, one can attach to a developable surface not
only a (one-dimensional) bicycle spoke, but also a (two-dimensional) ruler (Figure
13.7).
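The contrast between a general ruled surface and a developable one can be seen in a small computation. The sketch below is our own illustration under stated assumptions: the hyperboloid is taken as x² + y² - z² = 1 with the ruling (1, s, s), the developable surface is the cylinder x² + y² = 1 with the ruling (1, 0, s), and normals are obtained as gradients of these equations. None of these specific choices come from the text.

```python
import math

def angle_deg(u, v):
    """Angle between two vectors in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))

def hyperboloid_normal(s):
    """Normal to x^2 + y^2 - z^2 = 1 at the point (1, s, s) of a ruling."""
    x, y, z = 1.0, s, s
    return (2 * x, 2 * y, -2 * z)

def cylinder_normal(s):
    """Normal to x^2 + y^2 = 1 at the point (1, 0, s) of a ruling."""
    x, y = 1.0, 0.0
    return (2 * x, 2 * y, 0.0)

for s in [1.0, 10.0, 1000.0]:
    turn = angle_deg(hyperboloid_normal(-s), hyperboloid_normal(s))
    still = angle_deg(cylinder_normal(-s), cylinder_normal(s))
    print(s, turn, still)
```

Along the ruling of the hyperboloid the normal, and with it the tangent plane, turns through an angle approaching 180° as the two points move apart, which is the "almost half a turn" mentioned above; along the ruling of the cylinder the angle stays zero.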




Figure 13.7. The tangent planes along a ruling are the same

This criterion (which we also do not prove) provides not only a necessary, but
also a sufficient condition for a ruled surface to be developable.

13.4 Let us expand the lines on a developable surface. Look again at 
Figure 13.5. Since our surface S is not infinite (a sheet of paper cannot be infinite!) 
the straight lines on S are not infinite: they begin and end on the boundary of the 
surface. Let us expand these lines, in one of the two possible directions. What will 
happen? 



Figure 13.8. Expanding the rulings upwards 

This question seems innocent at first glance. Let us expand the lines shown
in Figure 13.5 upwards, in the direction where they diverge. We see that nothing
extraordinary will happen: the surface will grow, eventually becoming less and less
curved, more and more resembling a plane (Figure 13.8).




Figure 13.9. Expanding the rulings downwards 

But what if we expand the lines in the opposite direction (Figure 13.9)? The
reader can pause here and think about this question. The lines converge, but they
do not, in general, come to one point; we can expect that they are pairwise skew.
We can expect that they will first converge and then diverge, forming a surface
resembling a hyperboloid - but the hyperboloid is not a developable surface, so this
is also not likely. It is not easy to guess what will happen. And what happens is
the following:




Figure 13.10. Cuspidal edge 

The surface will not be smooth; it will gain a cuspidal edge. This is a curve
such that the section of the surface by a plane perpendicular to this curve looks
like a semi-cubic parabola (Figure 13.10). Moreover, all the lines on the surface S
will be tangent to this curve.

13.5 Why a cuspidal edge? Let us try, if not to prove this statement, then,
at least, to explain why it should hold. Let us try to make an actual drawing of
the expanded lines of Figure 13.5 (see Figure 13.11). You can see the cuspidal
edge in Figure 13.11 with your own eyes!




Figure 13.11. The envelope of the rulings 

But no, this is not convincing. The drawing of a hyperboloid (Figure 13.12) 
looks precisely the same: there is a curve on the drawing (the side hyperbola) to 
which all the lines on the hyperboloid are visibly tangent. 






Figure 13.12. The tangent plane of the hyperboloid is perpendicular to the drawing

We say "visibly," since the lines which we see on the drawing are the projections
of the lines on the surface onto the planar page of the book. No tangency occurs on
the surface; it is simply that the tangent planes to the hyperboloid at the points
corresponding to the points of the side hyperbola are perpendicular to our drawing
and their projections are lines. But this is not possible on a developable surface,
because of the rule formulated in Section 13.3. Indeed, the tangent planes to the
surface at the points of our line not belonging to the proposed cuspidal edge are
not perpendicular to the plane of the drawing. But the tangent plane is the same
at all points of the line (remember our ruler rule?); hence it is not perpendicular
to the drawing at the points of tangency to the edge.




Figure 13.13. The tangent plane of a developable surface is not 
perpendicular to the drawing 

Thus, the tangent plane looks as shown in Figure 13.13, which shows
that the curve, the cuspidal edge, is indeed tangent to the straight lines on the
surface.

13.6 Backward construction: from the cuspidal edge to a developable
surface. Since our surface consists of lines tangent to the cuspidal edge, we can
look at our construction from the opposite end. Let us begin with a space curve
(which should be nowhere planar). Take all the tangent lines to our curve; they
sweep out a surface. This surface is a developable surface, and the initial curve is
its cuspidal edge. What is really surprising is that an arbitrary nowhere planar
developable surface (including the sheet of paper which you are holding in your
hands) can be obtained in this way.

Well, not quite arbitrary. There are two exceptional, degenerate, cases. Your 
surface may be cylindrical which means that all the lines on it are parallel to 
each other; it has no cuspidal edge (one can say that its cuspidal edge escapes to 
infinity). Or it can be conical which means that all the lines pass through one 
point (one can say in this case that the cuspidal edge collapses to a point). But 
a "generic," randomly bent sheet of paper always consists of tangent lines to an 
invisible cuspidal edge (invisible, because it always lies not on the sheet, but on the 
expanded surface). 
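The backward construction can be tested against the ruler rule of Section 13.3. The following is a minimal sketch under our own assumptions: the curve is the helix (cos t, sin t, t), the surface is parametrized as S(t, s) = γ(t) + s γ'(t), and the normal is the cross product of the two partial derivatives. The parametrization and all function names are illustrative, not taken from the text.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(sum(a * a for a in u))

def helix(t):
    return (math.cos(t), math.sin(t), t)

def helix_velocity(t):
    return (-math.sin(t), math.cos(t), 1.0)

def helix_acceleration(t):
    return (-math.cos(t), -math.sin(t), 0.0)

def surface_point(t, s):
    """Point at parameter s on the tangent line to the helix at parameter t."""
    return tuple(p + s * v for p, v in zip(helix(t), helix_velocity(t)))

def surface_normal(t, s):
    """Cross product of dS/dt = gamma' + s*gamma'' and dS/ds = gamma'."""
    d_t = tuple(v + s * a for v, a in zip(helix_velocity(t), helix_acceleration(t)))
    d_s = helix_velocity(t)
    return cross(d_t, d_s)

def unit(u):
    length = norm(u)
    return tuple(a / length for a in u)

t = 1.0
n_near = unit(surface_normal(t, 0.5))
n_far = unit(surface_normal(t, 3.0))
print(norm(cross(n_near, n_far)))     # parallel normals along one ruling
print(norm(surface_normal(t, 0.0)))   # the normal degenerates on the helix itself
```

The unit normals at two different points of the same tangent line are parallel, so the tangent plane is the same along the ruling; at s = 0, that is, on the helix itself, the cross product vanishes, which is where the surface stops being smooth.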

It is no less surprising that an arbitrary non-planar curve is a cuspidal edge of
the surface formed by its tangent lines. For an illustration, the reader may look at
the picture of the surface formed by tangent lines to the most usual helix (Figure
13.14).

Figure 13.14. This surface is made of the tangent lines to a helix

A handy reader may even make a model of this surface out of a helical piece of
wire and a bunch of bicycle spokes. The spokes should be attached to the wire as
tangents to the curve.

13.7 Is the cuspidal edge smooth? Is this all that one can say? Actually,
no, as we shall see in a moment. Magnify, mentally, your surface to such a size that
you can walk on it, and then walk across the straight lines on the surface. Since the
lines are tangent to the cuspidal edge, the distance from you to the tangency point
will rapidly decrease or rapidly increase, and both are possible. What happens at
the moment of a transition from one mode to the other one? Figure 13.15 presents
a sheet of paper with straight lines on it and two segments of the cuspidal edge.
And what is between them? A smooth curve like the flash insert on this drawing?



Figure 13.15. Is the cuspidal edge everywhere smooth? 

No, this curve cannot be tangent to all straight lines. Thus, only one possibility 
remains: 

The cuspidal edge itself must have cusps at some points (see Figure 13.16). 




Figure 13.16. The cuspidal edge has a cusp 

Let us try to understand what the surface looks like in the proximity of these 
incredible points. 

13.8 The swallow tail. Let us begin with a picture. The surface shown in
Figure 13.17 is called a swallow tail (we let the reader judge how much it resembles
the actual tail of a swallow). Besides the cuspidal edge it also has a curve of
self-intersection. The left hand side drawing shows a family of straight lines on the
surface; the right hand side drawing (of the same surface) presents several relevant
plane sections.

Figure 13.17. The swallow tail: straight rulings and plane sections

To convince ourselves that the surface indeed looks as shown in Figure 13.17,
let us act as in Section 13.6: start with a cuspidal edge and construct a surface as
the union of the tangent lines.

The "typical" space curve with a cusp can be obtained from a (planar) semi-cubic
parabola by slightly bending its plane. This curve may be described, in a
rectangular coordinate system, by parametric equations $x = at^2$, $y = bt^3$, $z = ct^4$.
Look at this curve from above (so you will see a semi-cubic parabola) and draw
its tangent lines. Break every line into three parts by the tangency point and the
intersection point with the line of symmetry of the semi-cubic parabola as shown
in Figure 13.18.



Then draw separately the first, the second, and the third parts of the tangent
lines (Figure 13.19, a-c); these are three parts of the swallow tail. Figure 13.19 a
shows the piece of the surface between the two branches of the cuspidal edge; it is
slightly concave up. Figure 13.19 b presents the two pieces of the swallow tail and
the self-intersection curve, and Figure 13.19 c shows the rest of the surface. Notice
that the parts of the surface presented in Figures 13.19 b and 13.19 c have edges
along the self-intersection curve, and that this curve is a half of a usual planar
parabola.



Thus a surface obtained by a most natural expansion of a randomly bent sheet
of paper has a cuspidal edge with cusps and looks like a swallow tail in a neighborhood
of the cusps of the cuspidal edge. This is the answer to the innocent-looking
question which we asked at the beginning of Section 13.4.

Figure 13.18. Tangent line to a semi-cubic parabola

Figure 13.19. Three parts of the swallow tail

13.9 There are swallow tails all around. You may remember that in another
part of this book (Lecture 9) we were trying to convince the reader that there
are cusps all around us. This was true for planar geometry; in space, we have to
admit that there are swallow tails all around. The spatial constructions similar
to those of Lecture 9, including fronts of surfaces and visible contours of
four-dimensional bodies, lead to surfaces with swallow tails. For instance, if you take
a surface looking like an ellipsoid (for example, an ellipsoid) and then move every
point along a normal line inside the ellipsoid, then, at some moment, the moving
surface will acquire cuspidal edges, self-intersections and swallow tails which will
pass through each other and finally disappear.

But, historically, the first picture of the swallow tail (not the name: it was
given to the surface in the 1960s by René Thom) appeared in the middle of the 19th
century in algebra books. We discussed this aspect of the swallow tail in Lecture
8. Recall that if we are interested in the number of (real) solutions of the equation

(13.1) $x^4 + px^2 + qx + r = 0$,

then we need to consider a swallow tail in space with coordinates p, q, r (the cuspidal
edge of this swallow tail should be $p = -6t^2$, $q = 8t^3$, $r = -3t^4$).
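The parametrization of the cuspidal edge can be checked by a direct expansion. On the cuspidal edge the equation has a triple root t, and since the polynomial (13.1) has no cubic term, its roots sum to zero, so the remaining simple root must be -3t. The sketch below, with helper names of our own choosing, multiplies out (x - t)³(x + 3t) and compares the coefficients.

```python
def poly_mul(a, b):
    """Product of two polynomials given by coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def quartic_with_triple_root(t):
    """Coefficients [r, q, p, 0, 1] of (x - t)^3 (x + 3t), lowest degree first."""
    linear = [-t, 1]                                   # x - t
    cube = poly_mul(poly_mul(linear, linear), linear)  # (x - t)^3
    return poly_mul(cube, [3 * t, 1])                  # times (x + 3t)

for t in range(-3, 4):
    r, q, p, cubic, leading = quartic_with_triple_root(t)
    print(t, (p, q, r), (p, q, r) == (-6 * t**2, 8 * t**3, -3 * t**4))
```

The coefficient of x³ comes out as zero and the remaining coefficients reproduce p = -6t², q = 8t³, r = -3t⁴.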

Inside the triangular pocket of the swallow tail, there will be points (p, q, r)
for which the equation (13.1) will have 4 real solutions. Above the surface there
will be points (p, q, r) corresponding to the equations with 2 real (and 2 complex
conjugate) solutions. Below the surface, there will be no real solutions at all. On
the surface, apart from the boundary of the pocket (in other words, on the part of the
surface corresponding to Figure 13.19 c), there will be one real solution (repeated
twice) and a pair of complex conjugate solutions. On the boundary of the pocket,
there will be 3 real solutions: two simple and one repeated; the difference between
the top part of this boundary (Figure 13.19 a) and the side parts (Figure 13.19 b)
is in the order of the solutions: on the top part, the repeated root lies between the
simple roots; on the two side parts it is, respectively, less than each simple root
and greater than each simple root. On the cuspidal edge, there are two roots: one
triple and one simple; the two branches of the cuspidal edge distinguish between
the two possible inequalities between these roots. On the self-intersection curve,
there are two pairs of double roots (by the way, the second half of the parabola of
self-intersection lies in the "no real roots" domain; it corresponds to equations with
repeated complex conjugate roots). Finally, the most singular point, the cusp of
the cuspidal edge, corresponds to the equation $x^4 = 0$ with four equal roots.
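The root counts in the different regions can be explored for particular values of (p, q, r). The sketch below is our own illustration and uses a swapped-in technique, Sturm's theorem with exact rational arithmetic, to count the distinct real roots of (13.1); the function names and sample points are hypothetical.

```python
from fractions import Fraction

def derivative(poly):
    """Derivative of a polynomial given by coefficients, highest degree first."""
    degree = len(poly) - 1
    return [c * (degree - i) for i, c in enumerate(poly[:-1])]

def remainder(a, b):
    """Remainder of the division of polynomial a by polynomial b."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)                      # the leading term has been cancelled
    while a and a[0] == 0:
        a.pop(0)
    return a

def sign_changes(signs):
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def count_real_roots(p, q, r):
    """Number of distinct real roots of x^4 + p x^2 + q x + r (Sturm's theorem)."""
    poly = [Fraction(c) for c in (1, 0, p, q, r)]
    chain = [poly, derivative(poly)]
    while True:
        rem = remainder(chain[-2], chain[-1])
        if not rem:
            break
        chain.append([-c for c in rem])
    # Signs of the chain at +infinity and at -infinity from the leading terms.
    at_plus = [1 if f[0] > 0 else -1 for f in chain]
    at_minus = [s if (len(f) - 1) % 2 == 0 else -s for s, f in zip(at_plus, chain)]
    return sign_changes(at_minus) - sign_changes(at_plus)

for point in [(-5, 0, 4), (0, 0, -1), (0, 0, 1), (-6, 8, -3), (0, 0, 0)]:
    print(point, count_real_roots(*point))
```

The sample points are chosen by hand: (-5, 0, 4) gives (x² - 1)(x² - 4) with four real roots, (0, 0, -1) and (0, 0, 1) give x⁴ ∓ 1, (-6, 8, -3) lies on the cuspidal edge (t = 1), and (0, 0, 0) is the cusp. Note that the function counts distinct roots, so repeated roots are counted once.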

Note that the first picture of the swallow tail looks very different from Figure
13.17 (see Figure 8.11 in Lecture 8).

13.10 Exercises. For solving the exercises below the reader may use all theorems
provided in this lecture, with or without complete proofs.

Let $\gamma = \{x = x(t), y = y(t), z = z(t)\}$ be a curve and $P = (x(t_0), y(t_0), z(t_0))$ be
a non-inflection point (which means that the velocity vector
$\gamma'(t_0) = (x'(t_0), y'(t_0), z'(t_0))$ and the acceleration vector
$\gamma''(t_0) = (x''(t_0), y''(t_0), z''(t_0))$ are not collinear). The
plane spanned by these two vectors at P is called the osculating plane of $\gamma$ at the
point P.

13.1. Prove that if a plane $\Pi$ contains the tangent line to the curve $\gamma$ at P and
in any neighborhood of P the curve $\gamma$ does not lie on one side of $\Pi$, then $\Pi$ is the
osculating plane.

13.2. Prove that the tangent planes of a generic (non-planar, non-cylindrical 
and non-conical) developable surface are osculating planes to the cuspidal edge, and 
vice versa. (There are tangent planes to the surface passing through cusps of the 
cuspidal edge; these planes may be regarded as osculating planes of the cuspidal 
edge, although this case is not covered by the definition above.) 

13.3. Prove that a generic family of planes in space is the family of tangent 
planes to a developable surface, and hence, by Exercise 13.2, also a family of oscu- 
lating planes to a curve. 

Comment. Thus, a family of planes has two "envelopes": a developable surface
and a curve; the latter is the cuspidal edge of the former.

13.4. (Exercise 13.3 in formulas.)

(a) Let

$$A(t)x + B(t)y + C(t)z + D(t) = 0$$

(where t is a parameter) be a family of planes. Prove that to get parametric
equations of the enveloping developable surface, one needs to take, as parameters,
t and one of the coordinates and then to solve the system

$$\begin{cases} A(t)x + B(t)y + C(t)z + D(t) = 0 \\ A'(t)x + B'(t)y + C'(t)z + D'(t) = 0 \end{cases}$$

with respect to the remaining two coordinates. To get parametric equations of the
enveloping curve, one needs to solve with respect to x, y, z the system

$$\begin{cases} A(t)x + B(t)y + C(t)z + D(t) = 0 \\ A'(t)x + B'(t)y + C'(t)z + D'(t) = 0 \\ A''(t)x + B''(t)y + C''(t)z + D''(t) = 0 \end{cases}$$

(b) Apply these formulas to the family of planes obtained from the plane $x + z = 0$
by rotating about the z axis with simultaneous parallel shift in the direction of
the same axis:

$$x \cos t - y \sin t + z - t = 0.$$
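To see the recipe of part (a) at work without giving away part (b), here is a small sketch on a different family, chosen by us: the osculating planes 3t²x - 3ty + z - t³ = 0 of the twisted cubic (s, s², s³). By Exercises 13.2 and 13.3 the enveloping curve should be the twisted cubic itself. The derivatives of the coefficients are written out by hand, and the 3 × 3 system is solved by Cramer's rule; all names are illustrative.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(matrix)
    solution = []
    for col in range(3):
        replaced = [list(row) for row in matrix]
        for row in range(3):
            replaced[row][col] = rhs[row]
        solution.append(det3(replaced) / d)
    return tuple(solution)

def envelope_point(t):
    """Point of the enveloping curve of the planes 3t^2 x - 3t y + z - t^3 = 0."""
    rows = [
        (3 * t * t, -3 * t, 1.0, -t ** 3),   # A, B, C, D
        (6 * t, -3.0, 0.0, -3 * t * t),      # A', B', C', D'
        (6.0, 0.0, 0.0, -6 * t),             # A'', B'', C'', D''
    ]
    matrix = [row[:3] for row in rows]
    rhs = [-row[3] for row in rows]
    return solve3(matrix, rhs)

for t in [-2.0, 0.5, 1.5]:
    print(t, envelope_point(t), (t, t ** 2, t ** 3))
```

For each t the solution of the three equations coincides with the point (t, t², t³) of the twisted cubic, as expected.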

13.5. Take a spatial curve with an inflection point, x — t,y — t 3 , z — t , and 
consider the developable surface formed by tangents to this curve. Investigate all 
singularities (cuspidal edges and self-intersections) of this surface. 

13.6. * (a) Consider a ruled developable disc D and let $\gamma$ be a smooth closed
curve on D. Prove that there are two points of $\gamma$ that lie on the same ruling of D
and such that the tangent lines to $\gamma$ at these points are parallel.

(b) Construct a developable disc and a smooth closed curve on it that has no
parallel tangents.




LECTURE 14 

Paper Möbius Band

14.1 Introduction: it is not about ants or scissors. The Möbius band is
an immensely popular geometrical object. Even small children can make it: take
a paper strip, twist it through 180 degrees (by half a turn), and then attach the
ends to each other by glue or tape. By the way, one of us is still grateful to his
analysis professor who taught his students how to draw a Möbius band: draw a
standard trefoil, then add the three double tangents, then erase three segments of
the curve between self-intersections and tangency points (Figure 14.1). You can see
the drawing, it is beautiful.

Figure 14.1. How to draw a Möbius band

There are a number of familiar tricks involving the Möbius band. You can cut
it along the middle circle, and - check for yourself what happens. Or you can let
a stupid ant crawl from one side to another without crossing the border. But we
shall consider a totally different problem: if it is so easy to make a Möbius band out
of a paper strip, then what shape of strip should one take? More precisely: there
should be a real number $\lambda$ such that with a rectangular strip of paper of width 1
and length $\ell$ one can make a Möbius band for $\ell > \lambda$, but it is impossible if $\ell < \lambda$.

Question: what is $\lambda$?

Answer: not known.

We could have stopped here, but we shall not. Let us discuss what is known
about this problem, and what the prospects are.

14.2 Do not fold paper. Our readers who are familiar with Lecture 13 on
paper sheet geometry know that the condition of smoothness is crucial in problems
of this kind. Indeed, if it is allowed to, say, fold the paper, then a Möbius band can
be made of an arbitrary paper rectangle, even when its width exceeds its length.
How to do it is shown in Figure 14.2: take a rectangular piece of paper (of any
dimensions), pleat it, then twist and glue. The condition of smoothness of our
surface, that is, in more mathematical terms, of the existence of a unique tangent
plane at every point of the surface, should play a role in our problem.




Figure 14.2. Making a Möbius band of folded paper

Now we are prepared to formulate our main result.

14.3 Main Theorem. Let $\lambda$ be a real number such that a smooth Möbius
band can be made of a paper rectangle of dimensions $1 \times \ell$ if $\ell > \lambda$ and cannot if
$\ell < \lambda$.

Theorem 14.1. $\frac{\pi}{2} \le \lambda \le \sqrt{3}$.

Thus, the interval between $\pi/2 \approx 1.57$ and $\sqrt{3} \approx 1.73$ remains a grey zone for
our problem. Later, we shall discuss the situation within this zone, but first let us
prove this theorem.

We shall need some general properties of surfaces made of paper.




Figure 14.3. A ruling of a paper surface 

14.4 Surfaces made of paper. We discussed these properties in Lecture 13
mentioned above. Not every surface can be made of paper. Restrictions come from
physical properties of (ideal) paper: it is flexible but not stretchable. The latter
means that any curve drawn on a sheet of paper retains its length after we bend the
sheet into a surface. As we know from Lecture 13, any paper surface is ruled, which
means that every point belongs to a straight interval lying on the surface. The line
containing this interval is unique, unless a piece of the surface around the chosen
point is planar. Thus, any paper surface consists of planar domains and straight
intervals. If we draw these intervals and shade these domains on the surface and
then unfold the surface into a planar sheet, we will get a picture like the one shown
in Figure 14.3.

14.5 Proof of the inequality $\lambda \ge \frac{\pi}{2}$. Let a Möbius band be made of a paper
strip of width 1 and length $\ell$. If we take a very long (infinite) strip of width 1, we
can wind it onto our Möbius band, so that every rectangle of length $\ell$ will assume
the shape of our Möbius band (and these rectangles will be located alternately
on two sides of the core band). Mark on the strip the straight intervals and the
planar domains (the latter will have the shape of trapezoids which may degenerate
into triangles; they are shaded in the upper Figure 14.4). The picture is periodic:
it repeats itself with period $2\ell$, and the successive rectangles of length $\ell$ repeat each
other but turned upside down. We fill the planar domains with straight intervals,
so that the whole strip will be covered by a continuous family of intervals (pairwise
disjoint) with the same periodicity property as above (see lower Figure 14.4). All
the intervals have lengths $\ge 1$, their ends lie on the boundary lines of the strip, and
all of them remain straight when we bend our strip into a Möbius band.

Figure 14.4. Filling planar domains on the strip with straight segments



Figure 14.5. AB and CD.

Take any interval of our family, say, AB. Shift it to the right by $\ell$ to the position
A'B' and then reflect the interval A'B' in the middle line of the strip (Figure 14.5).
The resulting interval CD also belongs to our family (because of the periodicity
properties described above).

Two things are obvious. First, $AC + BD = 2\ell$; second, on the Möbius band the
point C coincides with B and the point D coincides with A. The second statement
means that on our paper model the angle between the intervals AB and CD is 180°.
Thus, in space, the intervals of our family between AB and CD form an angle with
the interval AB which continuously varies from 0 to 180°. Take some (big) number
n and choose intervals $A_0B_0 = AB, A_1B_1, \dots, A_{n-1}B_{n-1}, A_nB_n = CD$ of our
family (Figure 14.6) such that the angle between AB and $A_kB_k$ (on the Möbius
band) is equal to $k \cdot \frac{180^\circ}{n}$ (for $k = 0, 1, \dots, n - 1$). This implies that the angle
between $A_kB_k$ and $A_{k+1}B_{k+1}$ is at least $\frac{180^\circ}{n}$.

Lemma 14.1. Let $a_n$ be the side of the regular n-gon inscribed into a circle of
diameter 1. Then (on our paper strip) $A_kA_{k+1} + B_kB_{k+1} \ge a_n$ (for any k).

Figure 14.6. The family $A_kB_k$.

Proof. Consider a piece of our paper Möbius band containing (the images of)
the intervals $A_kB_k$ and $A_{k+1}B_{k+1}$. The lengths of the segments $A_kA_{k+1}$, $B_kB_{k+1}$ in
space do not exceed the lengths of the same intervals on the strip (the latter are
equal to the lengths of the arcs $A_kA_{k+1}$, $B_kB_{k+1}$ on the boundary curve of the Möbius
band). So, it is sufficient to prove our inequality for the points $A_k, A_{k+1}, B_k, B_{k+1}$ in
space. Take a point E such that $A_kE$ has the same length and direction as $A_{k+1}B_{k+1}$
(see Figure 14.7, left). Then $B_{k+1}E = A_{k+1}A_k$ and
$B_kE \le B_kB_{k+1} + B_{k+1}E = A_kA_{k+1} + B_kB_{k+1}$.

But $B_kE \ge a_n$. To prove this, consider an isosceles triangle KLM inscribed
into a circle of diameter 1 whose base LM is a side of a regular n-gon inscribed into
the same circle and which contains the center of the circle (Figure 14.7, right). In
this triangle, $\angle MKL = 180^\circ/n$ and $KL = KM = b_n < 1$. In the triangle $A_kB_kE$,
denote by F and G the points on the sides $A_kB_k$ and $A_kE$ at distance $b_n$ from $A_k$
(these points exist since $A_kB_k \ge 1 > b_n$ and $A_kE = A_{k+1}B_{k+1} \ge 1 > b_n$). Then
$B_kE \ge FH \ge FG \ge a_n$ (the latter is true since $\angle B_kA_kE \ge 180^\circ/n$). □




Figure 14.7. Proof of Lemma 14.1

Back to the inequality $\lambda \ge \pi/2$. For a strip of length $\ell$ from which a Möbius band
is made, we have:

$2\ell = AC + BD$

$= (A_0A_1 + \dots + A_{n-1}A_n) + (B_0B_1 + \dots + B_{n-1}B_n)$

$= (A_0A_1 + B_0B_1) + \dots + (A_{n-1}A_n + B_{n-1}B_n) \ge n a_n.$

Since this is true for every $n$, and $n a_n$ approaches $\pi$ when $n$ grows, we have $2\ell \ge \pi$
for every such $\ell$, and hence $2\lambda \ge \pi$.
□
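
The limit used in the last step can be checked numerically. The following is a minimal sketch, not part of the original argument; the function names are ours, and it relies only on the elementary fact that a chord subtending the central angle $2\pi/n$ in a circle of diameter 1 has length $\sin(\pi/n)$.

```python
import math


def inscribed_side(n: int) -> float:
    """Side a_n of the regular n-gon inscribed in a circle of diameter 1."""
    # Chord of a circle of radius 1/2 subtending the central angle 2*pi/n:
    # 2 * (1/2) * sin(pi/n) = sin(pi/n).
    return math.sin(math.pi / n)


def inscribed_perimeter(n: int) -> float:
    """The quantity n * a_n that bounds 2*l from below in the proof."""
    return n * inscribed_side(n)


if __name__ == "__main__":
    for n in (3, 6, 12, 100, 10_000):
        print(n, inscribed_perimeter(n), math.pi - inscribed_perimeter(n))
```

The printed gap to $\pi$ shrinks roughly like $1/n^2$, which is why the bound $2\ell \ge n a_n$ for every $n$ yields $2\ell \ge \pi$ in the limit.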

14.6 Proof of the inequality $\lambda \le \sqrt{3}$. To prove this inequality, it is sufficient
to show how a Möbius band can be made of a $1 \times \ell$ strip for an arbitrary $\ell > \sqrt{3}$.
We shall show how a Möbius band can be made of a strip of length precisely $\sqrt{3}$,
but this will require several folds. We have a commitment to avoid folds, but it is
clear that disjoint folds can be smoothened at the expense of an arbitrarily slight
elongation of the strip (see Figure 14.8).



Figure 14.8. Rounding folds. 

The construction is shown in Figure 14.9: we take a rectangle $ABCD$ with
$AB = 1$, $AD = \sqrt{3}$, and draw equilateral triangles $AKL$ and $KLC$ with $K$ on $BC$
and $L$ on $AD$. Notice that the right triangles $ABK$ and $CDL$ are two halves of
one more equilateral triangle. (This construction is possible, since the side of an
equilateral triangle with altitude 1 is $2\tan 30^\circ = \frac{2}{3}\sqrt{3}$, and
$\sqrt{3} = \frac{2}{3}\sqrt{3} + \frac{1}{3}\sqrt{3}$.)
Then we fold the strip along the lines $AK$, $KL$, and $LC$, as shown in Figure 14.9.
□

Notice that the "Möbius band" we constructed does not look like a Möbius band.
It is rather the union of three identical equilateral paper triangles $AKL$, the top
one attached to the middle one along the side $AL$, the middle one attached to the
bottom one along the side $KL$, and the top one and the bottom one attached to
each other along the side $AK$. If we take a strip slightly longer than $\sqrt{3}$ and round
the folds, we shall get a smooth Möbius band which will still look more like an
equilateral triangle than a Möbius band.

Figure 14.9. Constructing a Möbius band of a rectangle $1 \times \sqrt{3}$.



14.7 Why is a more precise value of $\lambda$ not known? Until a problem is
solved, it is difficult to say why it is not solved. Still sometimes it is possible to
detect common difficulties in different unsolved problems, which, in turn, may help
to predict success or failure in solving some problems, or even to guess a solution.
In previous sections we proved that $\lambda$ is a point of the segment $[\pi/2, \sqrt{3}]$. Which
point? Is there, at least, a plausible conjecture? Yes: we think that $\lambda = \sqrt{3}$ and
we are not surprised that a proof has not been found yet.

To justify this, let us notice that our proof of the inequality $\lambda \ge \pi/2$ does not
use one important property of a paper Möbius band: it has no self-intersections.
One cannot make a self-intersecting Möbius band of a real-life paper sheet, but it
is not hard to imagine it: like a self-intersecting curve, it passes "through itself"
but consists of non-self-intersecting pieces.

Suppose that, from the very beginning, speaking of a paper Möbius band, we do
not exclude the possibility of self-intersections. Then the number $\lambda$ acquires a new
sense, and the new value of $\lambda$ will be less than or equal to the old one. Moreover,
the inequality $\lambda \ge \pi/2$ will remain valid, and we shall not need to change a single
word in its proof: the absence of self-intersections is not used in it at all. As to the
inequality $\lambda \le \sqrt{3}$, it may be considerably improved.

Theorem 14.2. A smooth self-intersecting Möbius band can be made of a paper
rectangle $1 \times \ell$ for any $\ell > \pi/2$.

Proof. Take an arbitrarily big odd $n$, and consider a regular $n$-gon such that
the distance from a vertex to the opposite side equals 1. Let $p_n$ be the perimeter
of this $n$-gon; it is clear that when $n$ grows, the $n$-gon becomes indistinguishable
from a circle of diameter 1 and $p_n$ approaches $\pi$.

Take a rectangle $ABCD$ of dimensions $1 \times \frac{p_n}{2}$ and inscribe in it $n - 1$ isosceles
triangles $AKQ, KQL, \dots, MNC$, equal to the triangle formed by a side and two
of the longest diagonals of our regular $n$-gon (see Figure 14.10 where $n = 7$). The
triangles $ABK$ and $NCD$ are two halves of such a triangle. Then fold the rectangle
along the lines $AK, KQ, \dots, NC$ (in alternating directions). The process of this
folding is shown in Figure 14.10.

In the end we shall obtain a paper figure, indistinguishable from a regular $n$-gon
(a regular heptagon in our picture), with the segments $AB$ and $CD$ almost merging
together: they will be separated only by several layers of folded paper. If we make
the folds smooth (this will require a slightly longer strip) and then connect $AB$ with
$CD$ by a very short paper strip (which will create severe self-intersections), we shall
get a smooth self-intersecting Möbius band with the ratio between the length and
the width of the strip arbitrarily close to $\pi/2$. □

Figure 14.10. A heptagonal model of a self-intersecting Möbius band.
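
The strip length $p_n/2$ in this proof can be tabulated. The sketch below is only an illustration under our own derivation: for odd $n$, the distance from a vertex to the opposite side is the circumradius plus the apothem, $R(1 + \cos(\pi/n))$, which we set equal to 1. The function name is hypothetical.

```python
import math


def half_perimeter(n: int) -> float:
    """Length p_n / 2 of the strip built from a regular n-gon (n odd)
    whose vertex-to-opposite-side distance equals 1."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer >= 3")
    # Vertex-to-opposite-side distance = R + R*cos(pi/n) = 1.
    radius = 1.0 / (1.0 + math.cos(math.pi / n))
    side = 2.0 * radius * math.sin(math.pi / n)
    return n * side / 2.0


if __name__ == "__main__":
    for n in (3, 5, 7, 21, 1001):
        print(n, half_perimeter(n))
```

For $n = 3$ the value is $\sqrt{3}$, the strip length of the triangular construction in Section 14.6, and as $n$ grows the values decrease toward $\pi/2$.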

Thus, if we want to prove that $\lambda > \pi/2$, our proof would have to use the absence
of self-intersections. The question whether a surface has self-intersections belongs
to three-dimensional "position geometry". The whole experience of mathematics
shows that this part of geometry is especially hard: there are almost no technical
means to approach its problems. Thus, if an improvement of the inequality $\lambda \ge \pi/2$
exists, it is difficult to find a proof. On the contrary, an improvement of the
inequality $\lambda \le \sqrt{3}$ would have involved a construction better than that of Section
14.6. But one can expect this construction to be natural and beautiful; the fact
that we do not know it may be regarded as an indication that it does not exist.
For this reason it seems plausible to us that $\lambda = \sqrt{3}$, but the proof is hardly easy.

14.8 Exercise. Suppose that we have a paper cylinder, made of a paper strip
of dimensions $1 \times \ell$. Is it possible to turn it inside out (without violating its
smoothness)? Clearly, if the cylinder is short and wide ($\ell$ is big), then it is possible,
but if the cylinder is long and narrow ($\ell$ is small), it is impossible. Where is
the boundary between short and wide cylinders and long and narrow ones? The
following statements due to B. Halpern and K. Weaver [41] give a partial answer to
this problem. (Nothing else is known, so far.)

14.1. * (a) If $\ell > \pi + 2$, then it is possible to turn the cylinder inside out.

(b) If $\ell > \pi$, then the cylinder can be turned inside out with self-intersections.

(c) If $\ell < \pi$, then the cylinder cannot be turned inside out, with or without
self-intersections.




LECTURE 15 

More on Paper Folding 

15.1 The fold line is straight. Take a sheet of paper and fold it: the fold 
line is straight, see Figure 15.1. We start our discussion of paper folding with a 
mathematical explanation of this phenomenon. 




Figure 15.1. Folding a sheet of paper yields a straight line 

The model for a paper sheet is a piece of the plane. The fold curve partitions the 
plane into two parts. Performing folding, we establish a one-to-one correspondence 
between these parts, and this correspondence is an isometry: the distances between 
points do not change. The last property means that paper is not stretchable; this 
is our standing assumption, made in Lecture 13. 

Figure 15.2. Proving that the fold line cannot be curved


Call the fold line $\gamma$, and let us prove that it is straight. If not, $\gamma$ has a sub-arc
with non-zero curvature. Let $\gamma_+$ be the curve $\gamma$ translated a (small) distance $\varepsilon$ from
$\gamma$ on the concave side, and $\gamma_-$ that on the convex side. Then

length $\gamma_+$ > length $\gamma$ > length $\gamma_-$,

see Figure 15.2 (the difference is of order $\varepsilon \cdot$ length $\gamma \cdot$ curvature $\gamma$). On the other
hand, the isometry takes $\gamma_+$ to $\gamma_-$, so length $\gamma_+$ and length $\gamma_-$ must be equal. This
is a contradiction.

15.2 And still, the fold line can be curved. In spite of what has just been 
said, one can fold paper along an arbitrary smooth curve! The reader is invited to 
try an experiment: draw a curve on a sheet of paper and slightly fold the paper 
along the curve. 1 The result is shown in Figure 15.3 on the left. 




Figure 15.3. A sheet of paper folded along a curve 

One may even start with a closed curve drawn on paper. To be able to fold, 
one must cut a hole inside the curve, see Figure 15.4. 

It goes without saying that there is no contradiction to the argument in Section 
15.1: the two sheets in Figure 15.3, left, do not come close to each other, they meet 
at a non-zero angle (varying from point to point). 

To fix terminology, call the curve drawn on paper the fold and the curve in 
space, obtained as the result of folding, the ridge. Experiments with paper reveal 
the following: 

(1) It is possible to start with an arbitrary smooth fold and obtain an arbitrary 
ridge, provided the ridge is "more curved" than the fold. 

(2) At every point of the ridge, the two sheets of the folded paper make equal 
angles with the osculating plane 2 of the ridge. 

(3) If the fold has an inflection point (where it looks like a cubic parabola) 
then the corresponding point of the ridge is also an inflection point, that 
is, has zero curvature. 

1 A word of practical advice: press hard when drawing the curve. It also helps to cut a neigh- 
borhood of the curve not to mess with too large a sheet. A more serious reason for restricting to 
a neighborhood is that this way one avoids self-intersections of the sheets, unavoidable otherwise. 

2 See Section 15.3 for definition.

Figure 15.4. Closed fold line



(4) If the fold is closed and strictly convex then the ridge cannot be a planar 
curve. 

In the next sections we shall explain these experimental observations. 

15.3 Geometry of space curves. We need to say a few words about curva- 
ture of plane and space curves. 

Let $\gamma$ be a smooth plane curve. To define curvature, give the curve an arc length
parameterization, $\gamma(t)$. Then the velocity vector, $\gamma'(t)$, is unit, and the acceleration
vector $\gamma''(t)$ is always orthogonal to the curve. The magnitude of the acceleration,
$|\gamma''(t)|$, is the curvature of the curve. That is, the curvature is the rate of change
of the direction of the curve per unit of length.

Equivalently, one may consider the osculating circle of the curve at a given
point; this is the circle through three infinitesimally close points of the curve (see
Lecture 10). Curvature is the reciprocal of the radius of the osculating circle.

Still another way to measure curvature is as follows. Let every point of the curve 
move, with unit speed, in the direction, orthogonal to the curve (cf. Lecture 9). 
In this process, the length of the curve changes. The absolute value of the relative 
rate of change of the length at a point equals the curvature of the curve (this is easy 
to check for a circle; for an arbitrary curve, approximate by its osculating circle). 
This characterization of curvature was used in the argument at the end of Section 
15.1. 

We now turn to curves in space. Let $\gamma(t)$ be an arc length parameterized
spatial curve. Similarly to the planar case, its curvature is the magnitude of the
acceleration vector, $|\gamma''(t)|$.

Note the following important difference with the planar case. A typical plane 
curve has inflection points (points of zero curvature) where it looks like Figure 15.5. 
The word "typical" means that if one perturbs the curve slightly then the inflection 
point will move a little but will not disappear. In space, typical curves have no 
points of zero curvature. 




Figure 15.5. A typical plane curve has inflections 

(An accurate proof of this claim is rather tedious, but here is a plausible explanation.
The acceleration vector $\gamma''(t)$ is orthogonal to the curve and has two degrees
of freedom. For this vector to vanish, two independent conditions must hold. But
a point of a curve has only one degree of freedom, so we have more equations than
variables, and hence a typical curve has no points of zero curvature.)

Suppose our space curve has no points of zero curvature. The plane spanned by
the velocity and acceleration vectors $\gamma'(t)$ and $\gamma''(t)$ is called the osculating plane
of the curve. This plane approximates the curve at the point $\gamma(t)$ better than any other
plane: up to infinitesimals of second order, the curve lies in its osculating plane.
Equivalently, the osculating plane is the plane through three infinitesimally close
points of the curve.

The unit vector, orthogonal to the osculating plane, is called the binormal. The 
binormal vector changes from point to point, and the magnitude of its derivative 
(with respect to arc length parameter) is called torsion. Torsion measures how the 
osculating plane rotates along the curve. 
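
The definitions of curvature and torsion above can be tried out numerically. The following sketch is an illustration only: it uses a circular helix (our choice of example, not one from the text) and plain central finite differences with a step `h` that we picked by hand; all function names are hypothetical.

```python
import math


def helix(a: float, b: float):
    """Arc length parameterized helix t -> (a cos(t/c), a sin(t/c), b t/c)."""
    c = math.hypot(a, b)  # speed normalization, so that |gamma'(t)| = 1
    return lambda t: (a * math.cos(t / c), a * math.sin(t / c), b * t / c)


def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))


def scale(u, s):
    return tuple(s * x for x in u)


def norm(u):
    return math.sqrt(sum(x * x for x in u))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def velocity(curve, t, h=1e-3):
    """Central difference approximation of gamma'(t)."""
    return scale(sub(curve(t + h), curve(t - h)), 1.0 / (2.0 * h))


def acceleration(curve, t, h=1e-3):
    """Central difference approximation of gamma''(t)."""
    p, q, r = curve(t - h), curve(t), curve(t + h)
    return tuple((x - 2.0 * y + z) / (h * h) for x, y, z in zip(p, q, r))


def curvature(curve, t):
    """Curvature = magnitude of the acceleration (arc length parameter)."""
    return norm(acceleration(curve, t))


def binormal(curve, t):
    """Unit vector orthogonal to the osculating plane."""
    n = cross(velocity(curve, t), acceleration(curve, t))
    return scale(n, 1.0 / norm(n))


def torsion(curve, t, h=1e-3):
    """Magnitude of the derivative of the binormal."""
    db = sub(binormal(curve, t + h), binormal(curve, t - h))
    return norm(db) / (2.0 * h)


if __name__ == "__main__":
    gamma = helix(3.0, 4.0)
    print(curvature(gamma, 0.7), torsion(gamma, 0.7))
```

For the helix with $a = 3$, $b = 4$ the standard closed forms are $a/(a^2+b^2) = 0.12$ for the curvature and $b/(a^2+b^2) = 0.16$ for the torsion, and the finite-difference values should agree with these up to the discretization error.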

Suppose that an arc length parameterized curve $\gamma(t)$ lies on a surface $M$. The
acceleration vector $\gamma''(t)$ can be decomposed into two components: the component
orthogonal to $M$ and the tangential one. The magnitude of the latter is called the
geodesic curvature of the curve (cf. Lecture 20); it can be again interpreted as the
relative rate of change of the length as every point of $\gamma$ moves on $M$, with unit
speed, in the direction perpendicular to the curve.

15.4 Explaining paper folding experiments. Recall that our mathematical
models for paper sheets are developable surfaces. Extend the two sheets of
the developable surfaces in Figure 15.3, left, beyond their intersection curve, the
ridge, as in Figure 15.3, right. One sees two developable surfaces intersecting along
a space curve $\gamma$. Unfolding either of the surfaces to the plane transforms $\gamma$ to the
same plane curve $\delta$, the fold. Reverse the situation and pose the following question:
given a plane curve $\delta$, a space curve $\gamma$ and an isometry (distance preserving
correspondence) $f$ between $\delta$ and $\gamma$, is it possible to extend $f$ to a planar neighborhood
of $\delta$ to obtain a developable surface, containing $\gamma$? Said differently, can one bend a
sheet of paper, with a curve $\delta$ drawn on it, so that $\delta$ bends to a given space curve
$\gamma$?

Theorem 15.1. Assume that for every point $x$ of $\delta$ the curvature of $\gamma$ at the
respective point $f(x)$ is greater than the curvature of $\delta$ at $x$. Then there exist
exactly two extensions of $f$ to a plane neighborhood of $\delta$ yielding developable surfaces,
containing $\gamma$.

Proof. Parametrize the curves $\gamma$ and $\delta$ by an arc length parameter $t$ so that
$\gamma(t) = f(\delta(t))$. Let the desired developable surface $M$ make the angle $\alpha(t)$ with the
osculating plane of the curve $\gamma(t)$ (well defined since, by assumption, the curvature
of $\gamma$ never vanishes). Denote by $\kappa(t)$ the curvature of the space curve $\gamma$ and by $k(t)$
that of the plane curve $\delta$.

The magnitude of the curvature vector of $\gamma$ is $\kappa$, and its projection on $M$ has
magnitude $\kappa(t) \cos \alpha(t)$; thus the geodesic curvature of $\gamma$ equals $\kappa(t) \cos \alpha(t)$. The
geodesic curvature of a curve on a surface depends only on the inner geometry
of the surface and does not change when this surface is bent without stretching.
Therefore the geodesic curvature of $\gamma$ equals the curvature of the plane curve $\delta$:

(15.1) $\kappa(t) \cos \alpha(t) = k(t)$.

This equation uniquely determines the function $\alpha(t)$. Since $k < \kappa$, the angle $\alpha$
never vanishes.

To construct the developable surface $M$ from the function $\alpha(t)$, consider a
one-parameter family of planes through the points $\gamma(t)$, containing the tangent vector $\gamma'(t)$
and making angle $\alpha(t)$ with the osculating plane of $\gamma$ (there are two such planes,
see Figure 15.6). According to the discussion of developable surfaces in Lecture 13,
a one-parameter family of planes envelops a developable surface, and we obtain our
two surfaces through the curve $\gamma$. □
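
Equation (15.1) is easy to evaluate at a single point. The sketch below is our illustration, with a hypothetical function name; it assumes a non-negative fold curvature and the hypothesis of Theorem 15.1 that the ridge is more curved than the fold.

```python
import math


def fold_angle(k_fold: float, kappa_ridge: float) -> float:
    """Angle alpha between a sheet and the osculating plane of the ridge,
    obtained from kappa * cos(alpha) = k, i.e. equation (15.1)."""
    if not 0.0 <= k_fold < kappa_ridge:
        raise ValueError("need 0 <= k < kappa (ridge more curved than fold)")
    return math.acos(k_fold / kappa_ridge)


if __name__ == "__main__":
    # Fold: circle of radius 1 (k = 1); ridge: circle of radius 1/2 (kappa = 2).
    alpha = fold_angle(1.0, 2.0)
    print(math.degrees(alpha))
```

With a fold of curvature 1 bent into a ridge of curvature 2, each sheet makes an angle of $\pi/3$ with the osculating plane of the ridge; as the ridge curvature approaches the fold curvature, the angle tends to zero and the two sheets flatten into that plane.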




Figure 15.6. Construction of the developable surface from the
function $\alpha(t)$

The two developable surfaces of Theorem 15.1 are the sheets, intersecting along
the ridge in Figure 15.3. Extending the sheets beyond the ridge one obtains another
configuration of sheets that meet along the curve $\gamma$. Thus there are exactly two
ways to fold paper along the curve $\delta$ to produce the space curve $\gamma$. This explains and
extends the first observation made in Section 15.2.

In the particular case when $\gamma$ is a planar curve, one of the sheets is obtained
from the other by reflection in this plane. In the general case of a nonplanar curve $\gamma$,
the tangent planes of the two sheets are symmetric with respect to the osculating
plane of $\gamma$ at every point: indeed, the angles between the osculating plane and the
two sheets are equal to $\alpha$. This justifies the second observation in Section 15.2.

Proceed to the third observation. Let $\delta(t_0)$ be an inflection point where the
fold looks like a cubic parabola. Thus $k(t_0) = 0$, and the curvature does not
vanish immediately before and after the inflection point. According to formula
(15.1), either $\alpha(t_0) = \pi/2$ or $\kappa(t_0) = 0$. We want to show that, indeed, the latter
possibility holds.

Suppose not; then both sheets are perpendicular to the osculating plane of $\gamma$
at the point $\gamma(t_0)$, and therefore their tangent planes coincide. Moreover, if $\kappa(t_0) \ne 0$
then the projection of the curvature vector of the space curve $\gamma$ onto each sheet
is the vector of the geodesic curvature therein. This vector lies on one side of $\gamma$
on the surface at points $\gamma(t_0 - \varepsilon)$ just before the inflection point and on the other
side at points $\gamma(t_0 + \varepsilon)$ just after it. Therefore the function $\alpha(t) - \pi/2$ changes
sign at $t = t_0$. This means that the two sheets pass through each other at $t = t_0$.
Impossible for real paper, this implies that $\kappa(t_0) = 0$, that is, the ridge has an
inflection point.

Now, to the fourth observation. Assume that both the ridge $\gamma$ and the fold
$\delta$ are closed planar curves and $\delta$ is strictly convex. The relation (15.1) between
the curvatures still holds: $\kappa \cos \alpha = k$, and $k$ does not vanish anywhere. Hence $\kappa$
does not vanish either, and $\gamma$ is a convex planar curve. In addition, $\kappa(t) > k(t)$
for all $t$ and $\int \kappa(t)\,dt > \int k(t)\,dt$ since $\alpha(t)$ does not vanish. On the other hand,
the integral curvature of a simple closed planar curve equals $2\pi$, see Exercise 15.1.
Therefore the two integrals must be equal, a contradiction.

15.5 More formulas and further observations. According to Theorem
15.1, the fold $\delta$ and the ridge $\gamma$ determine the developable surface, the result of
extending the isometry between $\delta$ and $\gamma$ to a neighborhood of $\delta$. Recall from
Lecture 13 that developable surfaces are ruled. Denote by $\beta(t)$ the angle made by
the rulings with $\gamma(t)$.

One should be able to express the angles $\beta(t)$ in terms of geometric characteristics
of the fold and the ridge. Indeed, such a formula exists:

(15.2) $\cot \beta(t) = \dfrac{\alpha'(t) - \tau(t)}{\kappa(t) \sin \alpha(t)}$,

where $\tau$ is the torsion of the curve $\gamma$ and $\alpha$ is the angle between the surface and the
osculating plane of the curve $\gamma$, given by equation (15.1). We do not deduce this
formula here: this is a relatively straightforward exercise on the Frenet formulas
in differential geometry of space curves; if the reader is familiar with the Frenet
formulas, he will do this as Exercise 15.3, and if not, he will trust us.

Back to the folded paper depicted in Figure 15.3. We see two developable
surfaces intersecting along the ridge $\gamma$, and each carries a family of rulings. Thus
we have two functions, $\beta_1(t)$ and $\beta_2(t)$. Unfolding the surfaces back in the plane
yields a planar curve, the fold $\delta$, with two families of rulings along it, one on each
side, see Figure 15.7.




Figure 15.7. Unfolding the folded paper 



The angles $\beta_1$ and $\beta_2$ are given by the formulas

$\cot \beta_1(t) = \dfrac{\alpha'(t) - \tau(t)}{\kappa(t) \sin \alpha(t)}, \qquad \cot \beta_2(t) = \dfrac{-\alpha'(t) - \tau(t)}{\kappa(t) \sin \alpha(t)},$

the first being (15.2), and the second obtained from the first by replacing $\alpha$ by
$\pi - \alpha$. It follows that

(15.3) $\cot \beta_1(t) + \cot \beta_2(t) = \dfrac{-2\tau(t)}{\kappa(t) \sin \alpha(t)}, \qquad \cot \beta_1(t) - \cot \beta_2(t) = \dfrac{2\alpha'(t)}{\kappa(t) \sin \alpha(t)}.$

Formulas (15.3) have two interesting consequences. Suppose that the ridge is
a planar curve. Then $\tau = 0$, and therefore $\beta_1 + \beta_2 = \pi$. In this case unfolding the
two sheets in the plane yields the straight rulings that extend each other on both
sides of the fold, see Figure 15.8. Suppose now that the dihedral angle between the two
sheets along the ridge is constant. Then $\alpha' = 0$, and therefore $\beta_1 = \beta_2$. In this case
the rulings make equal angles with the fold.
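
The two consequences just stated can be checked pointwise from the formulas for $\cot \beta_1$ and $\cot \beta_2$. This is a small sketch with a hypothetical function name and made-up sample values; it takes the ruling angles in the interval $(0, \pi)$, which is our convention for inverting the cotangent.

```python
import math


def ruling_angles(alpha: float, alpha_prime: float, tau: float, kappa: float):
    """Return (beta_1, beta_2) in (0, pi) from
    cot(beta_1) = ( alpha' - tau) / (kappa sin(alpha)),
    cot(beta_2) = (-alpha' - tau) / (kappa sin(alpha))."""
    denom = kappa * math.sin(alpha)
    if denom == 0.0:
        raise ValueError("kappa * sin(alpha) must be non-zero")
    cot1 = (alpha_prime - tau) / denom
    cot2 = (-alpha_prime - tau) / denom
    # atan2(1, c) is the angle in (0, pi) whose cotangent equals c.
    return math.atan2(1.0, cot1), math.atan2(1.0, cot2)


if __name__ == "__main__":
    # Planar ridge (tau = 0): the two angles should add up to pi.
    b1, b2 = ruling_angles(alpha=1.0, alpha_prime=0.3, tau=0.0, kappa=2.0)
    print(b1 + b2, math.pi)
    # Constant dihedral angle (alpha' = 0): the two angles should coincide.
    b1, b2 = ruling_angles(alpha=1.0, alpha_prime=0.0, tau=0.4, kappa=2.0)
    print(b1, b2)
```

When both $\tau$ and $\alpha'$ vanish, the function returns $\beta_1 = \beta_2 = \pi/2$, that is, rulings orthogonal to the fold.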




Figure 15.8. Rulings on both sides of the fold may extend each other 



And again we may reverse the situation: start with the fold $\delta$ and prescribe the
angles $\beta_1$ and $\beta_2$. The reader with a taste for further experimentation may paste,
with scotch tape, a number of pins or needles on both sides of the fold (thus fixing
the angles $\beta_1$ and $\beta_2$). Now fold!

15.6 Two examples. As the first example, let the fold be an arc of a circle,
and let the rulings on both sides be radial lines, orthogonal to the fold. Then
$\beta_1 = \beta_2 = \pi/2$. Therefore the ridge is planar and the dihedral angle between the
sheets is constant. The rulings on each sheet intersect at one point, hence both
sheets are cones, see Figure 15.9.

In the second example, we utilize the optical property of the parabola: the
family of rays from the focus reflects to the family of rays, parallel to the axis of
the parabola, see Figure 15.10 (the reader not familiar with this property should
either solve Exercise 15.4 or wait for a discussion in Lecture 28).

Let the fold be a parabola, let the rulings on the convex side be parallel to the 
axis and let the extensions of the rulings on the concave side all pass through the 
focus. By the optical property, these rulings make equal angles with the parabola, 
hence the dihedral angle between the sheets is constant. One of the sheets is again 
a cone; the rulings on the other being parallel, it is a cylinder, see Figure 15.11. 

Figure 15.11. One sheet is a cone, another a cylinder

15.7 Historical notes. We learned that paper can be folded along curves
from M. Kontsevich in 1994; he discovered this as an undergraduate student a long
time ago. Among other things, Kontsevich noticed that the ridge tends to be a
planar curve; this cannot be proved, unless some assumptions on elasticity of the
folded material are made (our mathematical model of paper folding completely
ignored these issues). We published the results of our reflections on paper folding
in [32]. We discovered that Theorem 15.1 was quite old: it was mentioned in [6].

Later we found that folding of non-stretchable material along curves had been
considered before [24]. Duncan and Duncan studied the problem with an eye on
engineering products made by folding and bending of a single sheet (such as sheet-metal
ductwork or cardboard containers).

We wonder whether this interesting subject has further antecedents. It remains 
for us to quote M. Berry's Law (posted on his web site 3 ): Nothing is ever discovered 
for the first time. 

15.8 Exercises. 

15.1. Let $\gamma(t)$ be a smooth arc length parameterized closed curve of length $L$
and winding number $w$. Let $k(t)$ be the curvature of $\gamma(t)$. Find $\int_0^L k(t)\,dt$.

15.2. Let $\gamma$ be a smooth closed curve of length $L$ and winding number $w$. Move
every point of $\gamma$ in the normal direction a small distance $\varepsilon$ to obtain a curve $\gamma_\varepsilon$.
Find the length of $\gamma_\varepsilon$.

15.3. Prove formula (15.2). 

15.4. Prove the optical property of the parabola. 

15.5. Let the fold line be an arc of an ellipse, and let the rulings on one side
of the fold line pass through one focus and on the other side through the other focus.
Prove that folding yields two cones making a constant angle along the ridge.

Hint. Use the optical property of ellipses, Lecture 28. 

15.6. Why does one need to make a hole in a piece of paper when folding along 
a closed curve? 



3 www.phy.bris.ac.uk/people/berry_mv/quotations.html



LECTURE 16

Straight Lines on Curved Surfaces 

16.1 What is a surface? We would prefer to avoid answering this question 
honestly, but to prove theorems we need precise definitions. 




FIGURE 16.1. Definition of a surface 

A set $S$ in space is called a surface if for every point $A$ in $S$ there exists a plane
$P$ and a positive number $r$ such that the intersection of $S$ with any ball of radius
$< r$ centered at $A$ has a 1-1 projection onto the plane $P$ (Figure 16.1). Planes,
spheres, cylinders, paraboloids, etc., are all surfaces.

Some surfaces, however curved they look, contain a whole straight line (like the
one shown in Figure 16.2).




Figure 16.2. This surface contains a straight line 

In this lecture, we shall consider surfaces which contain very many straight 
lines. 

16.2 Ruled surfaces. A surface $S$ is called ruled, if for every point $A$ in $S$,
there exists a straight line $\ell$ through $A$ contained in $S$.

There are many ruled surfaces. A plane is a ruled surface, but this is not 
interesting. Some other ruled surfaces, like cylinders, readily display their rulings. 
Some surfaces are also ruled, but this is less visible; for example, if you bend 
(without folding) a piece of paper, you will obtain a ruled surface - see Lecture 13. 
Here we shall be interested in a different class of surfaces. 




FIGURE 16.3. A one-sheeted hyperboloid 

16.3 Two key examples. A one-sheeted hyperboloid is described in space by
the equation

$x^2 + y^2 - z^2 = 1.$

It may be also described as a surface of revolution of the hyperbola $x^2 - z^2 = 1$ in
the $xz$ plane about the $z$ axis (Figure 16.3).

Do you think this surface is ruled? It is. To see it, make a cylinder of vertical
threads joining two identical horizontal hoops, and then rotate the upper hoop
about the vertical axis keeping the threads tight. Your cylinder will become a
hyperboloid, and you will see the ruling (Figure 16.4).




FIGURE 16.4. Twisting a cylinder provides a ruling of a hyperboloid 

Moreover, there exists a second ruling of the same surface: just rotate the hoop 
by the same angle in the opposite direction, see Figure 16.5 for a picture of both 
rulings (in fact, we shall get the mirror image of our hyperboloid, but being a surface 
of revolution, it is symmetric in any plane through the axis, and hence coincides 
with its mirror image). To obtain the hyperboloid described by the equation above 
one needs to make a special choice of the size of the cylinder and the angle of 
rotation; we leave details to the reader. 




Figure 16.5. A one-sheeted hyperboloid is doubly ruled 

One more example: a hyperbolic paraboloid. This can be described by a very
simple equation:

$z = xy$

(Figure 16.6, left). It resembles a horse saddle, or a landscape near a mountain pass.
The easiest way to construct a ruling of this surface is to take its intersections with
the planes $x = c$ parallel to the $yz$ plane. The intersection is given (in the coordinates
$y, z$ in the plane $x = c$) by the equation $z = cy$; it is a straight line. Again, this
surface has another ruling: take its intersections with the planes $y = c$ (Figure 16.6,
right).
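
Both pairs of rulings can be verified by substituting points of the lines into the equations. The sketch below is an illustration; the explicit parameterization of the hyperboloid lines (through the point $(\cos\theta, \sin\theta, 0)$ with direction $(-\sin\theta, \cos\theta, \pm 1)$) is our choice and is not spelled out in the text, and all function names are hypothetical.

```python
import math


def on_hyperboloid(p, tol=1e-9):
    """True if p = (x, y, z) satisfies x^2 + y^2 - z^2 = 1 up to tol."""
    x, y, z = p
    return abs(x * x + y * y - z * z - 1.0) < tol


def on_paraboloid(p, tol=1e-9):
    """True if p = (x, y, z) satisfies z = x*y up to tol."""
    x, y, z = p
    return abs(z - x * y) < tol


def hyperboloid_line(theta, s, family=1):
    """Point with parameter s on the line through (cos t, sin t, 0) with
    direction (-sin t, cos t, family), where family is +1 or -1."""
    return (math.cos(theta) - s * math.sin(theta),
            math.sin(theta) + s * math.cos(theta),
            family * s)


def paraboloid_line(c, s, family=1):
    """family 1: line z = c*y in the plane x = c;
    family 2: line z = c*x in the plane y = c."""
    return (c, s, c * s) if family == 1 else (s, c, c * s)


if __name__ == "__main__":
    samples = (-5.0, -1.0, 0.0, 2.5)
    print(all(on_hyperboloid(hyperboloid_line(0.3 * i, s, f))
              for i in range(20) for s in samples for f in (1, -1)))
    print(all(on_paraboloid(paraboloid_line(c, s, f))
              for c in samples for s in samples for f in (1, 2)))
```

For the hyperboloid the check reduces to the identity $(\cos\theta - s\sin\theta)^2 + (\sin\theta + s\cos\theta)^2 - s^2 = 1$, so every sampled point passes for both signs of the vertical component, which correspond to the two rulings.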




Figure 16.6. A hyperbolic paraboloid 



16.4 Doubly ruled surfaces. The two surfaces described above are doubly
ruled, which means that for every point $A$ of any of these surfaces, there are two
different lines, $\ell_1$ and $\ell_2$, through $A$ contained in the surface. One can obtain
further examples of doubly ruled surfaces by compressing the previous surfaces
toward planes, or stretching them from planes. Speaking more formally, there
are doubly ruled surfaces described by the equations $x^2 + y^2 - z^2 = 1$, $z = xy$
in arbitrary, not necessarily rectangular, coordinate systems. What is amazing is
that there are no other doubly ruled surfaces (we shall give a more precise statement
below). But we shall begin with a proposition which more or less rules out triply
ruled surfaces.

16.5 There are no non-planar triply ruled surfaces. A triply ruled sur- 
face should be defined as a surface such that for any point there exist three different 
lines through this point contained in the surface. We want to prove that, essentially, 
a triply ruled surface should be a plane. We begin with a statement that a much 
milder condition imposes a devastating restriction on the geometry of a surface. 

Theorem 16.1. Let $S$ be a ruled surface, and let $A \in S$ be a point such that
there are three different lines $\ell_1, \ell_2, \ell_3$ through $A$ contained in $S$. Then either $S$
contains a planar disc centered at $A$, or $S$ consists of straight lines passing through
$A$.

Figure 16.7. The angles $\alpha_1, \alpha_2, \alpha_3$ are disjoint



Proof. According to the definition of a surface, a piece of $S$ around $A$ has a 1-1
projection onto a domain $D$ in a plane $P$. Denote by $A', \ell'_1, \ell'_2, \ell'_3$ the projections of
$A, \ell_1, \ell_2, \ell_3$ onto $D$. Take a point $B$ in $S$, sufficiently close to $A$ and such that the
line $BA$ does not belong to the surface (if no such point exists, then the surface $S$
consists of lines passing through $A$). Let $\ell$ be a line through $B$ (not passing through
$A$) contained in $S$. Let $B', \ell'$ be the projections of $B, \ell$ onto the plane $P$. Then, if $B$ is
sufficiently close to $A$, the line $\ell'$ must intersect, within our domain $D$, at least two
of the lines $\ell'_1, \ell'_2, \ell'_3$. (Indeed, if $B'$ is sufficiently close to $A'$, then the lines through
$B'$ which do not intersect $\ell'_1$ in $D$ form a small angle $\alpha_1$ at $B'$, and similar small
angles, $\alpha_2$ and $\alpha_3$, arise for $\ell'_2$ and $\ell'_3$ - see Figure 16.7. These three angles are
disjoint, so a line through $B'$ can miss, in $D$, at most one of the lines $\ell'_1, \ell'_2, \ell'_3$.) Let
$\ell'$ cross $\ell'_1$ and $\ell'_2$. Then $\ell, \ell_1, \ell_2$ also cross each other in three different points and,
in particular, belong to some plane, $Q$. Moreover, for every point $C \in S$ sufficiently
close to $A$, the line in the surface through $C$ must cross, in two different points,
at least two of the lines $\ell, \ell_1, \ell_2$ (the proof is identical to the previous proof), and
hence $C$ also belongs to $Q$. □

Corollary 16.2. Locally, a triply ruled surface is a plane; that is, for every 
point of a triply ruled surface, there is a planar disc centered at this point and 
contained in the surface. 

Proof. According to Theorem 16.1, the surface, in a proximity of A, is either 
planar, or conical. But it is obvious that if a conical surface with the vertex at A 
contains a line not passing through A, then it is planar. □ 

(A better looking statement, which we leave to the reader to understand and
to prove, says that every connected triply ruled surface is a plane.)

Our next goal is to describe all doubly ruled surfaces in space. For this purpose,
we have to consider one very special class of surfaces.

16.6 Surfaces generated by triples of lines. Imagine that you have a job
of writing problem sections for textbooks in mathematics. Imagine further that
your assignment is to compose problems for a chapter dealing with equations of
lines and planes in space. Say, you write: find an equation of a plane passing
through three given points. It is a "good" problem: it always has a solution, and
this solution is usually unique (it is unique, unless, by accident, the three points are
"collinear", that is, belong to one line). The problem, however, will not be good
if you give four points (the problem, usually, will have no solutions) or two points
(there will be infinitely many solutions). Or suppose the problem asks for an equation
of a plane through a given point parallel to - how many? - given lines. The answer
to "how many?" is two: you take lines parallel to the given lines through the
given point, and if there are two lines, they determine (if they do not coincide) a
unique plane. Just for fun, think about the following problem: find a line (in space)
crossing - how many? - given lines. How many lines should we give to make the
problem good? We shall give the answer at the end of Section 16.7, so that you
have time to think about it. And for the time being, we shall consider a simpler
problem.

PROPOSITION 16.1. Let $A, \ell_1, \ell_2$ be a point and two lines in space such that $A$
does not belong to either $\ell_1$ or $\ell_2$ and all three do not belong to one plane. Then
there exists a unique line through $A$ which is coplanar with (that is, crossing or
parallel to) both $\ell_1$ and $\ell_2$.



Proof. (See Figure 16.8.) Let $P_1$ be the plane containing $A$ and $\ell_1$ and $P_2$ be
the plane containing $A$ and $\ell_2$. The assumptions in the Proposition imply that
such planes $P_1, P_2$ exist, are unique and different. Since they are not parallel (both
contain $A$), their intersection is a line. This line $\ell$ satisfies the conditions of the
proposition, and such a line is unique, because it must belong to both $P_1$ and $P_2$.
□
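
The construction in this proof is easy to carry out numerically. The following is a minimal sketch, not part of the original argument: the point, the two lines and all function names are our own illustrative choices. Each line is stored as a pair (point, direction); the normal of the plane through $A$ and a line is a cross product, and the sought line is directed along the cross product of the two normals.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def sub(a, b):
    """Componentwise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def transversal_direction(A, line1, line2):
    """Direction of the line through A coplanar with both lines.

    The plane through A and a line (point, direction) has normal
    direction x (point - A); the two planes intersect along the
    cross product of their normals.
    """
    n1 = cross(line1[1], sub(line1[0], A))
    n2 = cross(line2[1], sub(line2[0], A))
    return cross(n1, n2)

def coplanar(p1, d1, p2, d2):
    """Two lines are coplanar iff (d1 x d2) . (p2 - p1) = 0."""
    return dot(cross(d1, d2), sub(p2, p1)) == 0

# Illustrative data (an assumption, chosen for simple integer arithmetic).
A = (1, 0, 0)
l1 = ((0, 1, 0), (1, 0, -1))   # the points (t, 1, -t)
l2 = ((0, -1, 0), (1, 0, 1))   # the points (t, -1, t)
d = transversal_direction(A, l1, l2)
print(d)  # (0, 2, -2), that is, the line x = 1, y = -z
```

Integer coordinates keep the coplanarity test exact; with floating point data one would compare against a small tolerance instead.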




Figure 16.8. Proof of Proposition 16.1

As an immediate application, we single out a property of the two doubly ruled
surfaces considered in Section 16.3. Obviously a line belonging to the first family
of lines (first ruling) of the one-sheeted hyperboloid is coplanar with any line of
the second family. Take any three lines, $\ell_1, \ell_2, \ell_3$, from the first family. Then the
second family consists precisely of the lines coplanar with all three. Indeed, all the
lines of the second family have this property, so we only need to show that any line
$\ell$ with this property belongs to this family. Let $\ell$ cross the line, say, $\ell_1$ at a point $A$
(it cannot be parallel to all three). There is a line of the second family through $A$.
It must be coplanar with $\ell_2$ and $\ell_3$, so it must be $\ell$, since a line with this property is
unique. Thus, the one-sheeted hyperboloid is the union of all lines coplanar with any
three pairwise skew lines contained in the hyperboloid. Precisely the same is true
(and is proved in the same way) for the hyperbolic paraboloid (with an additional
remark that no two lines on the hyperbolic paraboloid are parallel).

16.7 Equations for surfaces generated by triples of lines. Let $\ell_1, \ell_2, \ell_3$
be three pairwise skew (not coplanar) lines in space. Consider straight lines coplanar
with all three of these lines. Actually, according to Proposition 16.1, there is one such
line passing through every point of $\ell_3$, and there is one more line, parallel to $\ell_3$ and
crossing $\ell_1$ and $\ell_2$. The union $S$ of all such lines is a ruled surface (we shall not
check that it is a surface, since it will follow from further results). We shall call $S$
a surface generated by the lines $\ell_1, \ell_2, \ell_3$.

Theorem 16.3. Let $S$ be a surface generated by pairwise skew lines $\ell_1, \ell_2, \ell_3$.

(1) If the lines $\ell_1, \ell_2, \ell_3$ are not parallel to one plane, then $S$ is described in some
(possibly, skew) coordinate system by the equation $x^2 + y^2 - z^2 = 1$.

(2) Otherwise, $S$ is described in some coordinate system by the equation $z = xy$.

Proof. Let us begin with Part (1). Let the lines $\ell_1, \ell_2, \ell_3$ be not parallel to one
plane.

Lemma 16.2. There exists a coordinate system with respect to which the lines
are described by the equations

$$(\ell_1)\quad x = -z,\ y = 1,$$
$$(\ell_2)\quad x = z,\ y = -1,$$
$$(\ell_3)\quad x = 1,\ y = z$$

(that is, consist of points $(t, 1, -t)$, $(t, -1, t)$, $(1, t, t)$).

Proof of Lemma. Let $\ell'_i$, $i = 1, 2, 3$, be a line parallel to $\ell_i$ and crossing the two
remaining lines $\ell_j$, $j \ne i$. (This line exists and is unique. Indeed, choose $j \ne i$ and
take the plane $P$ formed by the lines parallel to $\ell_i$ and crossing $\ell_j$. This plane is not
parallel to the third line, $\ell_k$, otherwise it is parallel to all three lines. Let $C$ be the
intersection point of $P$ and $\ell_k$. The line through $C$ parallel to $\ell_i$ crosses both $\ell_k$
and $\ell_j$; it is $\ell'_i$.) The lines $\ell_1, \ell'_2, \ell_3, \ell'_1, \ell_2, \ell'_3$ form a spatial hexagon with parallel
opposite sides. We denote its vertices by $ABCDEF$ (where $A$ is the intersection
point of $\ell_1$ and $\ell'_2$, $B$ is the intersection point of $\ell'_2$ and $\ell_3$, etc. - see Figure 16.9,
left). The opposite sides of this hexagon are parallel (as we already mentioned),
but also have equal lengths: we have

$$\overrightarrow{AD} = \overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CD} = \overrightarrow{AF} + \overrightarrow{FE} + \overrightarrow{ED},$$

and, since a presentation of a vector as a sum of vectors collinear to $\ell_1, \ell_2, \ell_3$ is
unique, one must have $\overrightarrow{AF} = \overrightarrow{CD}$, $\overrightarrow{AB} = \overrightarrow{ED}$, $\overrightarrow{BC} = \overrightarrow{FE}$.
This implies that the hexagon $ABCDEF$ is centrally symmetric; take its center
of symmetry, $O$, for the origin of a coordinate system. For $e_1$ take the vector from
$O$ to the midpoint of $BC$; for $e_2$ the vector $\overrightarrow{OA}$; for $e_3$ the vector $\overrightarrow{OB} - e_1 - e_2$.
In the coordinate system with the origin $O$ and coordinate vectors $e_1, e_2, e_3$ the
points $A, B, C, D, E, F$ will have coordinates as shown in Figure 16.9, right. We
see that the points $F, A$ have coordinates satisfying the equations $x = -z$, $y = 1$,
and hence the latter are the equations of the line $\ell_1$. Similarly the coordinates of
$D, E$ satisfy the equations $x = z$, $y = -1$ and the coordinates of $B, C$ satisfy the
equations $x = 1$, $y = z$, so these equations are those of the lines $\ell_2$ and $\ell_3$. □



Lemma 16.2 implies Part (1) of Theorem 16.3. Indeed, the lines in the Lemma
obviously belong to the surface $S'$ presented (in our coordinate system) by the
equation $x^2 + y^2 - z^2 = 1$. The points of $S'$ correspond to points of the one-sheeted
hyperboloid (presented by the same equation in the standard coordinate system),
and this correspondence takes lines into lines. Since the hyperboloid is the union
of lines coplanar with any three pairwise skew lines on it, the same is true for $S'$,
that is, $S'$ is the union of lines coplanar with $\ell_1, \ell_2, \ell_3$. Thus, $S'$ is $S$.

Turn now to Part (2). Let the lines $\ell_1, \ell_2, \ell_3$ lie in parallel planes, $P_1, P_2, P_3$.
Assume also that $P_2$ lies between $P_1$ and $P_3$, and that the ratio of the distances
from $P_1$ to $P_2$ and from $P_2$ to $P_3$ equals $a$.

Figure 16.9. Proof of Lemma 16.2

Lemma 16.3. There exists a coordinate system in which the lines are described
by the equations

$$(\ell_1)\quad y = -a,\ z = -ax,$$
$$(\ell_2)\quad y = 0,\ z = 0,$$
$$(\ell_3)\quad y = 1,\ z = x.$$

Proof of Lemma. Fix two lines, $m_1, m_2$, crossing the lines $\ell_1, \ell_2, \ell_3$ in the points
$A_1, A_2$; $B_1, B_2$; $C_1, C_2$ respectively (see Figure 16.10, left). Take $B_1$ for the origin of
the coordinate system and define coordinate vectors as $e_1 = \overrightarrow{B_1B_2}$, $e_2 = \overrightarrow{B_1C_1}$,
$e_3 = \overrightarrow{B_1C_2} - e_1 - e_2$. Then the coordinates of the points $B_1, B_2, C_1, C_2$ are as shown in
Figure 16.10, right. Furthermore, since the lines $\ell_1, \ell_2, \ell_3$ lie in parallel planes,

$$\frac{|A_1B_1|}{|B_1C_1|} = \frac{|A_2B_2|}{|B_2C_2|} = a,$$

and hence the coordinates of the points $A_1, A_2$ are $(0, 0, 0) - a((0, 1, 0) - (0, 0, 0)) =
(0, -a, 0)$ and $(1, 0, 0) - a((1, 1, 1) - (1, 0, 0)) = (1, -a, -a)$. This implies that the
lines $\ell_1, \ell_2, \ell_3$ have equations as stated. □



Lemma 16.3 implies Part (2) of Theorem 16.3 precisely as Lemma 16.2 implies 
Part (1). □ 

In conclusion, let us answer the question left unanswered at the beginning of
Section 16.6. If we want the problem of constructing a line coplanar with a certain
number of skew lines to be good, the number of lines should be four. Indeed, the
lines coplanar with the first three form a surface presented by an equation of degree
2. The fourth line intersects this surface in 2, or 1, or 0 points, and each of these
points is contained in a line coplanar with the first three lines. Thus, the number
of solutions is 2, 1, or 0: just as for a quadratic equation.
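
The count can be illustrated by a small computation. The sketch below is our own illustration under stated assumptions: the first three lines are taken to be those of Lemma 16.2, so that the surface they generate is $x^2 + y^2 - z^2 = 1$, and the sample "fourth lines" are chosen only for simple arithmetic (they are not claimed to be skew to the generating lines). Substituting the point $p + td$ into the equation gives a quadratic equation in $t$, and the function name `count_meeting_points` is hypothetical.

```python
def q(v, w):
    """Bilinear form associated with x^2 + y^2 - z^2."""
    return v[0]*w[0] + v[1]*w[1] - v[2]*w[2]

def count_meeting_points(p, d):
    """Number of real points where the line p + t*d meets x^2 + y^2 - z^2 = 1.

    Returns None when the whole line lies in the surface.
    """
    a = q(d, d)            # coefficient of t^2
    b = 2 * q(p, d)        # coefficient of t
    c = q(p, p) - 1        # constant term
    if a == 0:             # the quadratic degenerates
        if b == 0:
            return None if c == 0 else 0
        return 1           # one finite point (the other one is at infinity)
    disc = b*b - 4*a*c
    if disc > 0:
        return 2
    if disc == 0:
        return 1
    return 0

print(count_meeting_points((0, 0, 0), (1, 0, 0)))  # 2
print(count_meeting_points((1, 0, 0), (0, 1, 0)))  # 1
print(count_meeting_points((0, 0, 0), (0, 0, 1)))  # 0
```

Each real root $t$ gives a point of the fourth line on the surface, and through that point passes one line coplanar with the first three.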

16.8 There are no other doubly ruled surfaces. 

Theorem 16.4. Let $S$ be a doubly ruled surface containing no planar discs.
Then for every point $A$ in $S$ there exists a surface $S_0$ generated by three lines such
that within some ball around $A$ the surfaces $S$ and $S_0$ coincide.

Remark 16.4. It may be deduced from Theorem 16.4 that any connected non-planar
doubly ruled surface is generated by three skew lines, and hence, according
to the previous theorem, is described, in some coordinate system, by one of the
equations $x^2 + y^2 - z^2 = 1$ or $z = xy$. We leave a proof of this statement to the
reader.

Proof of Theorem 16.4. Since the surface is not planar, there are only two lines
passing through $A$ and contained in $S$; let $\ell_1, \ell_2$ be these two lines.

A piece of the surface around $A$ has a one-to-one projection onto a domain $D$ in a
plane. Let $A', \ell'_1, \ell'_2$ be the projections of $A, \ell_1, \ell_2$ (Figure 16.11, left). For a point
$B$ in $S$, sufficiently close to $A$, any line through its image $B'$ in $D$ crosses either
$\ell'_1$ or $\ell'_2$. Let $m_1, m_2$ be the two lines in $S$ through the point $B$, and let $m'_1, m'_2$ be
their images in $D$. Then each of $m'_1, m'_2$ crosses in $D$ one of $\ell'_1, \ell'_2$. But neither of
them can cross both: if, say, $m'_1$ crosses both $\ell'_1, \ell'_2$, then the lines $m_1, \ell_1, \ell_2$ form a
triangle, and the planar interior of this triangle should be contained in $S$ (any line
through any point $C$ inside this triangle crosses the contour of this triangle in two
points, and, hence, is contained in the plane of the triangle). For the same reason,
the two lines $m'_1, m'_2$ cannot cross the same line, $\ell'_1$ or $\ell'_2$. Thus, in some proximity
of $A$, every line in $S$ through any point of $S$ not on $\ell_1$ or $\ell_2$ crosses precisely one
of these lines.

Figure 16.10. Proof of Lemma 16.3

This enables us to speak of two families of lines in $S$: lines crossing $\ell_2$ (including
$\ell_1$) form "the first family", while lines crossing $\ell_1$ (including $\ell_2$) form "the second
family". Obviously: (1) within each family, the lines do not cross each other; (2)
every line of each family crosses every line of the other family; (3) the lines of each
family cover the whole surface (a whole piece around $A$) - see Figure 16.11, right.
This shows that the surface is generated (in a proximity of $A$) by any three lines of
any of the two families. □



Thus, at least locally, every non-planar doubly ruled surface is either a one- 
sheeted hyperboloid, or a hyperbolic paraboloid. 

16.9 Shadow theater. In conclusion, we shall consider configurations of shadows
of rulings of a doubly ruled surface on a flat screen. We shall restrict ourselves
to the case of a one-sheeted hyperboloid made of two identical round hoops and
a couple of dozen identical rods representing the two rulings (see Figure 16.5).
(See Exercise 16.6 for the case of a hyperbolic paraboloid.)

First assume that the rays of light are all parallel to each other (that is, the
source of light is at infinity) and to one of the lines, say, $\ell$, on the hyperboloid.
First ignore the hoops (that is, assume the lines are very long and the distance from
the screen is very big). The shadow of $\ell$ will be one point (let it be $A$). One of
the lines of the second family (say, $\ell'$) is parallel to $\ell$; its shadow will also be one
point (say, $A'$). Any line from the first family, with the exception of $\ell$, will cross
$\ell'$; hence its shadow will pass through $A'$. Similarly, the shadows of all the lines
from the second family will pass through $A$. Hence, the shadows of all lines on the
hyperboloid will be the lines on the screen passing through one of the points $A$ or
$A'$ (but not through both) - see Figure 16.12.

Figure 16.11. Proof of Theorem 16.4

Figure 16.12. Shadows of rulings

Now, add the hoops. Their shadows will be equal ellipses $E_1, E_2$ (or circles, if
we make them parallel to the screen; certainly, in this case the rays of light will not
be perpendicular to the screen). Since both $\ell$ and $\ell'$ cross the hoops, the ellipses
$E_1, E_2$ both pass through $A$ and $A'$. If $s$ is the shadow of a line $m$ of the second
family, that is, $s$ passes through $A$, then $s$ crosses $E_1$ in two points, $A$ and $B$, and
crosses $E_2$ in two points, $A$ and $B'$; the segment $BB'$ will be the shadow of the part
of the line $m$ between the hoops. In the same way, one can draw the shadows of the lines of
the first family (between the hoops). See Figure 16.13.

Figure 16.13. Shadows of hoops and rods

Consider now a different configuration. Place a point light source at a point
$L$ on the hyperboloid, denote by $\ell$ and $\ell'$ the two lines through $L$ and make the
screen parallel to $\ell$ and $\ell'$ (so the hyperboloid lies between the light source and the
screen - see Figure 16.14).

The lines $\ell$ and $\ell'$ cast no shadow. Let $m \ne \ell$ be a line on the hyperboloid from
the same family as $\ell$; it crosses $\ell'$ at some point $M$ (or is parallel to $\ell'$). The shadow
of $m$ is the intersection line of the screen and the plane of $\ell'$ and $m$; in particular, it
is parallel to $\ell'$. Similarly, the shadows of the lines of the second family are parallel
to $\ell$. Thus, if we ignore the hoops, the configuration of the shadows will be that of
two families of parallel lines (see Figure 16.15, left).

Now, let $A, A'$ be the intersection points of the lines $\ell, \ell'$ with the first hoop,
and $B, B'$ be the intersection points of these lines with the second hoop. Of the
two arcs $AA'$ of the first hoop, one does not produce any shadow; the shadow of the
second (bigger) arc is one branch of a hyperbola with asymptotes parallel to $\ell$ and
$\ell'$. The shadow of the big arc $BB'$ of the second hoop is one branch of a different
hyperbola, also with asymptotes parallel to $\ell, \ell'$. The whole picture is presented in
Figure 16.15, right. Notice that it is centrally symmetric: the center of symmetry
is the shadow of the point $L'$ opposite to $L$ (in other words, the intersection point
of the lines on the hyperboloid parallel to $\ell$ and $\ell'$).

Figure 16.14. A projection from a point on the hyperboloid




Figure 16.15. Shadows at the screen 



16.10 Exercises. 

16.1. Two points, A and B are moving at constant speeds along two skew lines 
in space. Which surface does the line AB sweep? 

16.2. Prove that any non-planar quadrilateral is contained in a unique hyper- 
bolic paraboloid. 

16.3. Let $ABCD$ be a spatial quadrilateral. Consider points $K, L, M, N$ on
the sides $AB, BC, CD, DA$ such that $\frac{AK}{AB} = \frac{DM}{CD}$ and $\frac{BL}{BC} = \frac{AN}{DA}$. Prove that the
segments $KM$ and $LN$ have a common point.

16.4. Let $\ell_1, \ell_2, \ell_3$ be three lines such that $\ell_1$ and $\ell_2$ are coplanar (but different)
and $\ell_3$ is skew to both of them. What surface do the lines $\ell_1, \ell_2, \ell_3$ generate? (In
other words, what is the union of all lines coplanar with all the three lines?)

Hint. Consider separately the cases when the line $\ell_3$ is parallel to the plane of
the lines $\ell_1, \ell_2$ or crosses this plane.

16.5. Find all lines coplanar with the four lines

$$x = 1,\ z = y; \qquad x = 0,\ z = 0; \qquad x = -1,\ z = -y; \qquad x = y,\ z = 4.$$

16.6. A hyperbolic paraboloid is projected onto a screen in the direction parallel

(a) to one of the rulings;

(b) to the two planes to which all rulings are parallel.

What will the projections of all the rulings look like?




LECTURE 17 
Twenty Seven Lines 

17.1 Introduction. We saw in Lecture 16 that some surfaces of degree 2 are 
totally made up of straight lines; moreover, they are doubly ruled. We remarked 
there that if we count not only real but also complex lines, then all surfaces of 
degree two, even spheres and paraboloids, become doubly ruled. 

If we adopt an algebraic approach to geometry, then the next step after surfaces 
of degree 2 should be surfaces of degree 3. But while the geometry of surfaces (and, 
certainly, curves) of degree 2 was well understood by the Greeks millennia ago, the 
systematic study of surfaces (and curves) of degree 3 was not started before the 
19-th century. 

Now there are books dedicated to the "cubic geometry" (let us mention "The 
non-singular cubic surfaces" by B. Segre [71] and "Cubic forms" by Yu. Manin 
[53]). Cubic geometry is very much different from classical "quadratic geometry". 
In particular, cubic surfaces are not ruled, in general. But still they contain abun- 
dant, although finite, families of straight lines. (By the way, surfaces of degree > 3 
usually do not contain any straight lines.) Geometers of the 19-th century, like 
Salmon and Cayley, found an answer to a natural question: 

How many straight lines does a surface of degree 3 contain? 

The answer is: twenty seven. 

17.2 "How many?" is this a good question to ask? The question 
makes sense in algebraic geometry, that is, in the geometry of curves and surfaces 
given by algebraic (polynomial) equations. Such curves and surfaces have degrees, 
which are the degrees of the polynomials. 

For example, how many common points do two lines in the plane have? The
right answer is 1, although it may also be 0 (if the lines are parallel) or $\infty$ (if they
coincide). In the first of these cases we may say that the point is "infinite" and still count
it. So the result is 1 or $\infty$.

Consider now a curve of degree two. It may be an ellipse, a hyperbola, a 
parabola, or something more degenerate, like a pair of lines. We can say that a 
curve of degree 2 and a line have 2, 1, 0, or $\infty$ common points. But the cases of
1 or 0 are disputable. If there is only one point, this means that either we have a
tangent, or two colliding points, or a line parallel to an asymptote of a hyperbola
or the axis of a parabola; in these cases the "second point" is infinite. 0 means that
we have complex points (points with complex coordinates satisfying the equations
of both the line and of the curve), or both points at infinity (this happens if
our line is an asymptote of a hyperbola). But if we count each point as many times
as it warrants and do not neglect complex or infinite solutions, then our answer is:
2 or $\infty$.

Similarly, curves of degrees $m$ and $n$ must have $mn$ or $\infty$ common points (the
Bézout theorem).



Figure 17.1. The curve x 3 — x 2 y + y 2 with inflection points shown 

Informally speaking, if a problem of algebraic geometry has finitely many so- 
lutions, then the number of solutions depends only on the degrees of curves and 
surfaces involved. Certainly, this becomes false if we are interested only in real 
solutions. What is worse, for some problems, it is never possible that all the solu- 
tions are real. For example, it is known that a curve of degree 3, not containing a 
straight line, has precisely 9 inflection points. But no more than 3 of them are real. 
A curve of degree 3 with 3 real inflection points is shown in Figure 17.1. (Another 
curve of degree 3 with 3 real inflection points is shown in Figure 18.6.) For the
reader's convenience, we marked an asymptote of the curve and indicated the three 
inflection points by arrows. 

17.3 Main result. 

Theorem 17.1. A surface of degree 3 contains 27 or $\infty$ straight lines.

17.4 An auxiliary problem: double tangents. A double tangent to a 
curve, or a surface, is a straight line that is tangent to the curve or surface at two 
different points. A point of tangency is counted as two (or more) intersection points 
of a line and a curve or a surface. Hence, curves or surfaces of degree < 4 never 
have double tangents that are not contained in them. 

Important Observation. A double tangent to a surface of degree 3 is contained
in this surface. 

Consider now curves of degree 4 in the plane. 

Question. How many double tangents does a curve of degree 4 have? 
Answer: 28 

We refrain from giving a full proof of this.¹ We restrict ourselves to constructing
a curve of degree 4 with 28 real, finite, different double tangents. Consider the
polynomial

$$p(x, y) = (4x^2 + y^2 - 1)(x^2 + 4y^2 - 1).$$

It has degree 4. The equation $p(x, y) = 0$ defines in the plane an "elliptic cross"
(see Figure 17.2, left). The cross divides the plane into 6 domains. The function
$p(x, y)$ is positive in the outer (unbounded) domain and in the central domain, and
is negative in the petals. Choose a very small positive $\varepsilon$ and consider the curve
$p(x, y) + \varepsilon = 0$, also of degree 4. It consists of four ovals within the petals of the
previous cross.² These ovals are very close to the boundaries of the petals.




Figure 17.2. Construction of a curve 



Every two ovals have (at least, but, actually, precisely) 4 common tangents:
two exterior and two interior. Also the ovals are not convex (their shape is close to
that of the petals), and each of them has a double tangent of its own. Total:

$$\binom{4}{2} \cdot 4 + 4 = 28.$$
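
The sign pattern of $p$ and the final count are easy to check by hand or by machine. The following is a small sketch of such a check; the sample points (one in each of the six domains) are our own choice, and the tally assumes, as stated above, 4 common tangents for each pair of ovals and one double tangent for each oval.

```python
from math import comb

def p(x, y):
    """The polynomial whose zero set is the 'elliptic cross'."""
    return (4*x*x + y*y - 1) * (x*x + 4*y*y - 1)

# One sample point in each of the six domains (our own choice of points).
center = p(0, 0)                      # inside both ellipses
outer = p(2, 2)                       # outside both ellipses
petals = [p(0.75, 0), p(-0.75, 0),    # inside exactly one ellipse
          p(0, 0.75), p(0, -0.75)]

print(center > 0, outer > 0, all(v < 0 for v in petals))  # True True True

# 4 common tangents for each pair of ovals, plus one double tangent per oval.
total = comb(4, 2) * 4 + 4
print(total)  # 28
```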

¹ This can be deduced from the Plücker formulas, see Lecture 12.

² The term "oval" is used elsewhere in this book as synonymous with a closed strictly convex
smooth curve. In real algebraic geometry, an oval of an algebraic curve is its component that
bounds a topological disc.

Figure 17.3. 28 double tangents

17.5 Surfaces of degree 3 and curves of degree 4. Let $S$ be a surface of
degree 3 given by the equation

$$p_3(x, y, z) + p_2(x, y, z) + p_1(x, y, z) + c = 0$$

where $p_1, p_2, p_3$ are homogeneous polynomials of degrees 1, 2, 3. Suppose that $0 =
(0, 0, 0) \in S$, which means that $c = 0$. Consider a line passing through 0; it consists
of points with proportional coordinates, say,

$$(17.1)\qquad x = \alpha t,\ y = \beta t,\ z = \gamma t, \qquad (\alpha, \beta, \gamma) \ne (0, 0, 0).$$

This line crosses $S$ at 0 and at two other points. We mark our line if these two points
coincide; that is, each marked line crosses $S$ at 0, is tangent to $S$ at some point $T$,
and has no common points with $S$ besides 0 and $T$. Consider the intersections of the
marked lines with a screen. We get a curve on the screen; denote it by $L$.




Figure 17.4. A projection of the surface on a screen

Thus, if $P \in L$, then the line through 0 and $P$ is tangent to $S$ at some point,
$T(P) \in S$. Note that if $\ell$ is a tangent line to $L$ at $P$, then the plane $p$ containing
0 and $\ell$ is tangent to $S$ at $T(P)$.

Let us show now that the curve $L$ has degree 4. To find the intersection of the
line (17.1) with $S$, plug (17.1) into the equation of $S$:

$$p_3(\alpha, \beta, \gamma)t^3 + p_2(\alpha, \beta, \gamma)t^2 + p_1(\alpha, \beta, \gamma)t = 0.$$

One solution of this equation is 0, the other two coincide if and only if

$$D(\alpha, \beta, \gamma) = p_2(\alpha, \beta, \gamma)^2 - 4p_3(\alpha, \beta, \gamma)p_1(\alpha, \beta, \gamma) = 0.$$

The intersection of the line (17.1) and the plane $z = 1$ corresponds to $t = \gamma^{-1}$ (if
$\gamma = 0$, then there is no intersection; this possibility corresponds to the points of $L$
"at infinity"; there must be 4 such points). This intersection has the coordinates
$(x, y, 1)$ where $x = \alpha/\gamma$, $y = \beta/\gamma$. The equation $D(\alpha, \beta, \gamma) = 0$ may be rewritten
as $D(x, y, 1)\gamma^4 = 0$, that is, $D(x, y, 1) = 0$. This is an equation of degree 4.
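
The key point here is that the discriminant $D = p_2^2 - 4p_3p_1$ is homogeneous of degree 4. The sketch below checks this on one example; the three polynomials are hypothetical sample choices of ours (any homogeneous polynomials of degrees 1, 2, 3 would do), and exact integer and rational arithmetic is used so that the equalities are tested exactly.

```python
from fractions import Fraction

# Hypothetical homogeneous polynomials of degrees 1, 2, 3 (sample choices).
def p1(a, b, c):
    return a + 2*b - c

def p2(a, b, c):
    return a*a - b*c + 3*c*c

def p3(a, b, c):
    return a**3 + b**3 - 2*a*b*c + c**3

def D(a, b, c):
    """Discriminant of p3*t^2 + p2*t + p1 (the cubic in t divided by t)."""
    return p2(a, b, c)**2 - 4 * p3(a, b, c) * p1(a, b, c)

samples = [(1, 2, 3), (-2, 5, 1), (4, 0, -7)]

# Homogeneity of degree 4: D(k*v) = k^4 * D(v).
homogeneous = all(D(*(k*t for t in v)) == k**4 * D(*v)
                  for v in samples for k in (2, -3, 5))

# Passing to the screen z = 1: D(alpha, beta, gamma) = gamma^4 * D(x, y, 1).
dehomogenized = all(
    D(a, b, c) == c**4 * D(Fraction(a, c), Fraction(b, c), 1)
    for (a, b, c) in samples)

print(homogeneous, dehomogenized)  # True True
```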

Let now $\ell$ be one of the 28 double tangents to $L$, with the tangency points $P_1$
and $P_2$. The plane $p$ containing 0 and $\ell$ is tangent to $S$ at $T(P_1)$ and at $T(P_2)$.
Hence, the line through $T(P_1)$ and $T(P_2)$ is tangent to $S$ at $T(P_1)$ and at $T(P_2)$,
which is possible only if it is contained in $S$ (see Important Observation in Section
17.4). This proves our theorem modulo the last, and rather unexpected, question.




Figure 17.5. From double tangents to lines on the surface 

17.6 Twenty eight or twenty seven? Seemingly, we have constructed 28 
straight lines within S. Let us show that one of them is a mirage. 

Who can swear that if $P = (x, y, 1) \in L$, then $T(P) \ne 0$? The equality
$T(P) = 0$ holds if and only if the line (17.1) has a triple intersection with $S$. This
means that the equation

$$p_3(x, y, 1)t^3 + p_2(x, y, 1)t^2 + p_1(x, y, 1)t = 0$$

has three coinciding solutions, $t_1 = t_2 = t_3 = 0$, which happens if and only if
$p_2(x, y, 1) = 0$ and $p_1(x, y, 1) = 0$. These two equations describe a line and a
curve of degree two in the plane with coordinates $x, y$; so the system has two solutions.
Geometrically, this means that there are two lines intersecting $S$ only at 0: all the
three intersection points merge. These two lines generate the tangent plane $p_0$ to
$S$ at 0; they intersect the plane $z = 1$ at two points of the curve $L$, and the plane
$p_0$ intersects the plane $z = 1$ in a line tangent to $L$ at these two points. This double
tangent to $L$ does not correspond to any line in $S$. Thus, we have "only" $28 - 1 = 27$
lines in $S$.

17.7 All these lines can be real. Consider the surface

$$(17.2)\qquad 4(x^3 + y^3 + z^3) = (x + y + z)^3 + 3(x + y + z).$$

It is shown in Figure 17.6; the vertical axis in this picture is the "diagonal" $x =
y = z$.




Figure 17.6. A cubic surface (17.2) 



Theorem 17.2. The surface (17.2) contains 27 real straight lines. 

All 27 lines on the surface (17.2) are shown in Figure 17.7 - you can try to count 
them. Still this figure looks rather messy, but the proof of Theorem 17.2 given below 
may shed some light on the construction of the lines and their behavior. 

Figure 17.7. The surface with 27 lines



Proof. Nine of the lines are obvious:

$$\begin{array}{lll}
(1)\ \{x = 0,\ y = -z\} & (2)\ \{y = 0,\ z = -x\} & (3)\ \{z = 0,\ x = -y\} \\
(4)\ \{x = 1,\ y = -z\} & (5)\ \{y = 1,\ z = -x\} & (6)\ \{z = 1,\ x = -y\} \\
(7)\ \{x = -1,\ y = -z\} & (8)\ \{y = -1,\ z = -x\} & (9)\ \{z = -1,\ x = -y\}
\end{array}$$

(each of these equations implies $x^3 + y^3 + z^3 = x + y + z = (x + y + z)^3$). These
lines lie in three parallel planes: $x + y + z = 0$, $x + y + z = 1$, $x + y + z = -1$;
in the first of these planes the lines all meet at the point $(0, 0, 0)$, in the other two
planes the lines form equilateral triangles.
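
The claim about these nine lines can be confirmed by direct substitution. Here is a minimal sketch with exact integer arithmetic; the parametrizations and names are ours. Restricted to a line, the equation of the surface becomes a polynomial of degree at most 3 in the parameter $t$, so vanishing at more than three values of $t$ means vanishing identically.

```python
def F(x, y, z):
    """Left side minus right side of (17.2)."""
    s = x + y + z
    return 4 * (x**3 + y**3 + z**3) - s**3 - 3 * s

def nine_lines():
    """Parametrizations of the nine 'obvious' lines."""
    lines = []
    for c in (0, 1, -1):
        lines.append(lambda t, c=c: (c, t, -t))    # x = c, y = -z
        lines.append(lambda t, c=c: (-t, c, t))    # y = c, z = -x
        lines.append(lambda t, c=c: (t, -t, c))    # z = c, x = -y
    return lines

# Seven sample values of t are more than enough for a cubic in t.
on_surface = [all(F(*line(t)) == 0 for t in range(-3, 4))
              for line in nine_lines()]
print(len(on_surface), all(on_surface))  # 9 True
```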

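For the remaining 18 lines we introduce, for further convenience, letter notation:
a, b, . . . , r. Six of these lines have simple equations:

$$\begin{array}{lll}
(b)\ \{z = 0,\ x = y + 1\} & (f)\ \{x = 0,\ y = z + 1\} & (j)\ \{y = 0,\ z = x + 1\} \\
(c)\ \{z = 0,\ x = y - 1\} & (g)\ \{x = 0,\ y = z - 1\} & (k)\ \{y = 0,\ z = x - 1\}
\end{array}$$
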
(To find these equations, we consider the intersections of the surface (17.2) with
the planes $x = 0$, $y = 0$ and $z = 0$. Say, plug $x = 0$ into the equation (17.2):
$4(y^3 + z^3) = (y + z)^3 + 3(y + z)$, hence $3y^3 + 3z^3 = 3yz(y + z) + 3(y + z)$, hence
either $y + z = 0$, or $y^2 - yz + z^2 = yz + 1$, that is, $(y - z)^2 = 1$, $y - z = \pm 1$. One of
the three equations obtained is that of the line (1), the other two are (f) and (g).
The cases $y = 0$, $z = 0$ are treated in a similar way.)




Figure 17.9. Sections of the surface (17.2), continued 

The equations of the remaining 12 lines involve the "golden ratio" $\varphi = \frac{1 + \sqrt{5}}{2}$.
The equations are:

$$\begin{array}{lll}
(a)\ \{x = \varphi(y + z),\ y = z + \varphi\} & (e)\ \{y = \varphi(z + x),\ z = x + \varphi\} & (i)\ \{z = \varphi(x + y),\ x = y + \varphi\} \\
(l)\ \{x = \varphi(y + z),\ y = z - \varphi\} & (d)\ \{y = \varphi(z + x),\ z = x - \varphi\} & (h)\ \{z = \varphi(x + y),\ x = y - \varphi\}
\end{array}$$

and

$$\begin{array}{lll}
(o)\ \{x = -\varphi^{-1}(y + z),\ y = z + \varphi^{-1}\} & (q)\ \{y = -\varphi^{-1}(z + x),\ z = x + \varphi^{-1}\} & (m)\ \{z = -\varphi^{-1}(x + y),\ x = y + \varphi^{-1}\} \\
(p)\ \{x = -\varphi^{-1}(y + z),\ y = z - \varphi^{-1}\} & (r)\ \{y = -\varphi^{-1}(z + x),\ z = x - \varphi^{-1}\} & (n)\ \{z = -\varphi^{-1}(x + y),\ x = y - \varphi^{-1}\}
\end{array}$$

(we leave plugging these 12 equations into the equation (17.2) and verifying that
these lines lie on the surface to the reader).
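
A reader who prefers a numerical sanity check to the algebra may use a sketch like the following. It is our own illustration, with floating point arithmetic and a tolerance instead of an exact proof: it builds the family $x = k(y + z)$, $y = z \pm \sqrt{k + 1}$ for $k = \varphi$ and $k = -\varphi^{-1}$ (note that $\sqrt{\varphi + 1} = \varphi$ and $\sqrt{1 - \varphi^{-1}} = \varphi^{-1}$), together with the cyclic shifts of the coordinates, and evaluates the equation (17.2) along each line.

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2

def F(p):
    """Left side minus right side of (17.2) at the point p."""
    x, y, z = p
    s = x + y + z
    return 4 * (x**3 + y**3 + z**3) - s**3 - 3 * s

def golden_lines():
    """The 12 lines x = k(y+z), y = z + c, with cyclic coordinate shifts."""
    lines = []
    for k in (PHI, -1 / PHI):
        for c in (sqrt(k + 1), -sqrt(k + 1)):
            f = lambda t, k=k, c=c: (k * (2*t + c), t + c, t)
            lines.append(f)
            # (x, y, z) -> (z, x, y) and (x, y, z) -> (y, z, x)
            lines.append(lambda t, f=f: (f(t)[2], f(t)[0], f(t)[1]))
            lines.append(lambda t, f=f: (f(t)[1], f(t)[2], f(t)[0]))
    return lines

lines = golden_lines()
worst = max(abs(F(line(t)))
            for line in lines for t in (-2.0, -1.0, 0.0, 0.5, 1.0, 2.0))
print(len(lines), worst < 1e-9)  # 12 True
```

The residual is not exactly zero because of rounding; a tolerance of $10^{-9}$ is far above the rounding error for points of this size.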

The diagrams in Figures 17.8, 17.9 show the sections of our surface by 12
different planes of the form $x + y + z = \mathrm{const}$ (centered at the point with $x = y = z$).
The traces of the lines (a) - (r) are also shown. You can see that in each of the
domains $x + y + z > 1$ and $x + y + z < -1$ the surface consists of a "central tube" and
three "wings". In the domain $-1 < x + y + z < 1$ these wings and the tube merge
together; there are 9 lines, (1) - (9), contained in this domain. Of the remaining 18
lines, 6 (three pairs of parallel lines, (m) - (r)) lie on the wings, and 12 (six pairs
of parallel lines, (a) - (l)) lie on the central tube. The configuration of these lines
is shown in Figure 17.10. □



17.8 Some other surfaces. There are other cubic surfaces with ample fam- 
ilies of real lines. We will briefly discuss some of them. 
Consider the family of surfaces 

$$(17.3)\qquad x^3 + y^3 + z^3 - 1 = a(x + y + z - 1)^3.$$

Theorem 17.3. If $a > \frac{1}{4}$ and $a \ne 1$, then the surface (17.3) contains 27 real
lines.

Proof. Three lines are obvious: $\{x = 1,\ y = -z\}$ and two more obtained by
switching $x$ with $y$ and $z$. Four more are $\{x = u,\ y + z = 0\}$, $\{x = 1,\ y + uz = 0\}$
where $u$ is one of the two solutions of the quadratic equation

$$(a - 1)(u - 1)^2 = 3u$$

and eight more are again obtained by switching $x$ with $y$ and $z$. Finally, four more
are $\{x + v^2(y + z) = 1,\ y - z = 2v - v^3(y + z)\}$ where $v$ is one of the four solutions of
the equation

$$(4a - 1)(v^2 - 1)^2 = 3v^2$$

and, once again, eight more are obtained by switching $x$ with $y$ and $z$. The total is
27. □
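
The lines listed in this proof can be tested numerically for a particular value of the parameter. The sketch below is an illustration under our own assumptions: we take the sample value $a = 2$, solve the two auxiliary equations by the quadratic formula (the second one is quadratic in $w = v^2$), write the last family as $x + v^2(y + z) = 1$, $y - z = 2v - v^3(y + z)$, and use cyclic shifts of the coordinates for "switching $x$ with $y$ and $z$". All function names are hypothetical.

```python
from math import sqrt

def G(p, a):
    """Left side minus right side of (17.3) at the point p."""
    x, y, z = p
    return x**3 + y**3 + z**3 - 1 - a * (x + y + z - 1)**3

def rotations(f):
    """A parametrized line together with its two cyclic coordinate shifts."""
    return [f,
            lambda t: (f(t)[2], f(t)[0], f(t)[1]),
            lambda t: (f(t)[1], f(t)[2], f(t)[0])]

def lines_of_theorem(a):
    base = [lambda t: (1.0, t, -t)]                       # x = 1, y = -z
    # (a-1)(u-1)^2 = 3u  <=>  (a-1)u^2 - (2a+1)u + (a-1) = 0
    r = sqrt(3 * (4*a - 1))
    for u in ((2*a + 1 + r) / (2*(a - 1)), (2*a + 1 - r) / (2*(a - 1))):
        base.append(lambda t, u=u: (u, t, -t))            # x = u, y + z = 0
        base.append(lambda t, u=u: (1.0, -u*t, t))        # x = 1, y + uz = 0
    # (4a-1)(w-1)^2 = 3w with w = v^2  <=>  (4a-1)w^2 - (8a+1)w + (4a-1) = 0
    r = sqrt(3 * (16*a - 1))
    for w in ((8*a + 1 + r) / (2*(4*a - 1)), (8*a + 1 - r) / (2*(4*a - 1))):
        for v in (sqrt(w), -sqrt(w)):
            def line(s, v=v):
                d = 2*v - v**3 * s                        # d = y - z, s = y + z
                return (1 - v*v*s, (s + d) / 2, (s - d) / 2)
            base.append(line)
    return [g for f in base for g in rotations(f)]

a = 2.0                                                   # sample value, a > 1/4, a != 1
lines = lines_of_theorem(a)
worst = max(abs(G(line(t), a))
            for line in lines for t in (-2.0, -1.0, 0.0, 0.5, 1.0, 2.0))
print(len(lines), worst < 1e-9)  # 27 True
```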

In the case $a = 1/4$, the equation (17.3) determines a surface with "singular"
points $(1, 1, 1)$, $(1, -1, -1)$, $(-1, 1, -1)$, $(-1, -1, 1)$ (in a neighborhood of each of
these points the surface looks like a cone; by the way, this "surface" is not a surface
in the sense of the definition given in Lecture 15.2). There are only 9 lines on this
surface: $\{x = 1,\ y = z\}$, $\{x = 1,\ y = -z\}$, $\{x = -1,\ y = -z\}$, and 6 more can be
obtained by switching $x$ with $y$ and $z$.

Figure 17.10. Lines on the tube

The case a = 1 is especially interesting. To make this surface more attractive, 
it is reasonable to take a (non-rectangular) coordinate system such that the points 
(0,0,0), (1, 0, 0), (0, 1, 0), (0, 0, 1) (belonging to the surface) are vertices of a regular 
tetrahedron. Then all the symmetries of space which take the tetrahedron into 
itself also take the surface into itself. We leave the work of finding the equations of
the lines in this surface to the reader (see Exercise 17.1). 

17.9 The configuration of the 27 lines. One can easily see in Figures 17.8, 
17.9, and still better, in Figure 17.10, that there are many crossings between the 
27 lines. Actually, these crossings obey very strict rules, the same for all cubic 
surfaces. As usual, we shall not make a difference between crossing and parallel 
lines, so we shall speak rather of coplanar, than crossing, lines. The first property 
is obvious. 

Theorem 17.4. If some two of the lines in our surface are coplanar then there 
exists a unique third line in our surface belonging to the same plane. 

Proof. The intersection of a cubic surface with a plane is a cubic curve in 
the plane, that is, it may be presented by an equation of degree three. If this 
intersection contains two different lines, then the equation of the curve is divisible 
by the equations of the lines, and, after division, we obtain an equation of degree 
one, which is the equation of the third line. □ 

The following theorem characterizes the coplanarity properties between the 
lines completely. 

Theorem 17.5. Let $\ell_1$ be any of the 27 lines in a cubic surface $S$.

(1) There exist precisely 10 lines in $S$ coplanar with $\ell_1$; let us denote them by
$\ell_2, \dots, \ell_{11}$. These 10 lines can be arranged into 5 pairs of mutually coplanar lines,
$\ell_2, \ell_3$; $\ell_4, \ell_5$; $\dots$; $\ell_{10}, \ell_{11}$. No other two lines among $\ell_2, \dots, \ell_{11}$ are coplanar.

(2) Each of the remaining 16 lines, $\ell_{12}, \dots, \ell_{27}$, is coplanar with precisely one
line of each of the pairs in (1). For any two of the lines $\ell_{12}, \dots, \ell_{27}$, the number of
lines from $\ell_2$ to $\ell_{11}$ coplanar with both is odd.

(3) Two of the lines $\ell_{12}, \dots, \ell_{27}$ are coplanar if and only if there is precisely
one of the lines $\ell_2, \dots, \ell_{11}$ coplanar with both (that is, the odd number mentioned in
Part (2) is one).

It is remarkable that all these statements are true whichever of the 27 lines one 
takes for £\. 

We shall not prove this theorem, but for the surface in Section 17.7 it can be 
checked with the help of the diagrams (Figures 17.8, 17.9, 17.10) and/or equations. 
For example, the line a is coplanar to each of the lines

(1), l; (5), h; (9), d; b, r; j, n.

The 5 pairs in this formula are also shown. Any of the other lines is coplanar with
one line from each pair. For example:

the line c is coplanar with l, (5), d, b, j;
the line f is coplanar with (1), (5), (9), r, n;
the line m is coplanar with l, (5), d, r, n.

The quintuples for the lines c and f contain only one common line, (5), and the
lines c and f are coplanar. On the contrary, the quintuples for the lines c and m
contain 3 common lines, l, (5), and d, and the lines c and m are not coplanar.
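
The rule of Part (3) can be tried out on the three quintuples quoted above. The following is a minimal Python sketch that only encodes the data from the text; we write l for the line paired with (1), and the names `quintuples`, `pairs` and `coplanar_by_rule` are ours, chosen for illustration.

```python
# Lines coplanar with both the line a and the given line, as listed in the text.
quintuples = {
    "c": {"l", "(5)", "d", "b", "j"},
    "f": {"(1)", "(5)", "(9)", "r", "n"},
    "m": {"l", "(5)", "d", "r", "n"},
}

# The five pairs of mutually coplanar lines that are coplanar with a.
pairs = [{"(1)", "l"}, {"(5)", "h"}, {"(9)", "d"}, {"b", "r"}, {"j", "n"}]


def coplanar_by_rule(u, v):
    """Apply Part (3) of Theorem 17.5: coplanar iff exactly one common line."""
    common = quintuples[u] & quintuples[v]
    # Part (2) predicts that the number of common lines is odd.
    assert len(common) % 2 == 1
    return len(common) == 1


for u, v in [("c", "f"), ("c", "m"), ("f", "m")]:
    print(u, v, "coplanar" if coplanar_by_rule(u, v) else "not coplanar")
```

The last pair, f and m, is not discussed in the text; the sketch merely reports what the rule predicts for it from the listed quintuples.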

Some other properties of the lines follow from Theorem 17.5. We leave them
to the reader as exercises.

17.10 Conclusion. Other enumerative problems in algebraic geometry.
Problems of computing the number of algebraic curves of a given degree (say,
straight lines) intersecting some other curves and/or tangent to some other curves
have recently become very popular because of their importance in modern theoretical
physics (more specifically, in quantum field theory, see [18, 44]). We will briefly
discuss here one such problem, interesting for its unexpected result and its dramatic
200-year-long history.

Question. Given 5 conics (= ellipses, hyperbolas, parabolas), how many conics
are tangent to all of them?

(Why 5? Because for 4 conics the number of conics tangent to them is infinite,
and for a generic set of 6 conics, there are no conics tangent to all of them at all.)

This problem was first considered by Steiner (whose theorem is mentioned in
Section 29.5), who published, in the beginning of the 19th century, his result: there are
7776 such conics. This result, however, seemed doubtful to many people. Several
decades after Steiner's work, De Jonquières repeated Steiner's computations, and
got a different result. But Steiner's reputation in the mathematical community was
so high that De Jonquières did not dare to publish his work. Finally, the right
answer was found, in 1864, by Chasles (whose other result is proved in Section
28.6): there are 3264 conics tangent to 5 given conics.

However, Chasles counted complex conics, and it remained unclear how many
of them could be real. In 1997, Ronga, Tognoli and Vust found a family of 5 ellipses
for which all 3264 tangent conics were real. And in 2005 Welschinger proved that
for a family of 5 real conics whose interiors are pairwise disjoint, at least 32 of the
3264 conics tangent to them are real.



17.11 Exercises.

17.1. Find the equations of all lines in the surface 

$x^3 + y^3 + z^3 - 1 = (x + y + z - 1)^3.$

Hints. (a) There are only 24 lines; the remaining three escape to infinity.

(b) There are three lines through each of the vertices of the tetrahedron de- 
scribed in Section 17.8; they are parallel to the three sides of the face, opposite to 
this vertex. This gives 12 lines. 

(c) The equations of the remaining 12 lines involve the golden ratio. 

17.2. Find straight lines on the surface 

$xyz + \beta(x^2 + y^2 + z^2) = \gamma.$

17.3. Among the 27 lines on a cubic surface, there are precisely 45 coplanar 
triples of lines. 

Remark. Some coplanar triples in Section 17.7 consist of lines passing through
one point (there are 7 such triples) or mutually parallel (there are two such triples).
These properties should be regarded as accidental: on a general cubic surface, these
events do not happen.

17.4. The maximal number of mutually non-coplanar lines is 6. There are 
precisely 72 such 6-tuples. 

17.5. The number of permutations of the 27 lines taking coplanar lines into
coplanar lines is $51{,}840 = 2^7 \cdot 3^4 \cdot 5$. (These permutations form a group known in
group theory as the group $E_6$.)




LECTURE 18 
Web Geometry 

18.1 Introduction. This lecture concerns web geometry, a relatively recent
chapter of differential geometry. Web geometry was created mostly by the outstanding
German geometer W. Blaschke and his collaborators in the 1920s. Web
geometry is connected by many threads with other parts of geometry, in particular,
with the Pappus theorem discovered by Pappus of Alexandria in the 4th century
A.D. A nice introduction to web geometry is a small book [8] by Blaschke (which
unfortunately was never translated into English) and an article by his student, a
great geometer of the 20th century, S.-S. Chern [14].

In differential geometry, one is often concerned with local properties of geometrical
objects. For example, we mentioned in Lecture 13 that a sheet of paper, no
matter how small, cannot be bent so that it becomes part of a sphere. The invariant
that distinguishes between the plane and the sphere is curvature, zero for the plane
and positive for the sphere (cf. Lecture 20), and the allowed transformations are
isometries (paper is not compressible or stretchable). In web geometry, the supply
of allowed deformations is even greater: one does not insist on preserving the
distances and considers all differentiable and invertible deformations of the plane.

18.2 Definition and a few examples. A d-web in a plane domain consists 
of d families of smooth curves so that no two curves are tangent and through every 
point there passes exactly one curve from each family. We always assume that each 
family consists of level curves of a smooth function (this is a meaningful condition: 
see the example of osculating circles of a plane curve discussed in Lecture 10) ; note 
however that this function is not at all unique. 

Two d-webs are considered the same if there is a smooth deformation of the 
domain that takes one to another. 

For d = 1, there is nothing to study: one can deform the curves into horizontal
lines. Likewise, if d = 2, one can deform both families so that they become horizontal
and vertical lines. (Proof: if the families consist of the level curves of functions
f(x,y) and g(x,y) then, in the new coordinates X = f(x,y), Y = g(x,y), the
curves are horizontal and vertical lines.) Interesting things start to happen when
d = 3.

Let us consider a few examples. The simplest 3-web consists of three families
of lines

x = const, y = const, x + y = const.

This 3-web is called trivial.

Every 3-web consists of the level curves of three smooth functions f(x,y),
g(x,y) and h(x,y); since no two curves in the web are tangent, the gradients of
any two of these functions are linearly independent. One has a simple triviality
criterion for a 3-web.

Lemma 18.1. If functions f, g, h can be chosen in such a way that

(18.1) $f + g + h = 0$

then the 3-web is trivial.

Proof. As before, consider the new coordinates X = f(x,y), Y = g(x,y) in
which the first two families consist of horizontal and vertical lines. Since $h = -f - g$,
the third family has the equation X + Y = const. □

Our next example is a 3-web in the interior of a triangle $A_1A_2A_3$. The curves of
the i-th family (i = 1, 2, 3) consist of the circles that pass through vertices $A_i$ and
$A_{i+1}$, see Figure 18.1 (we consider the indices mod 3: this convention makes sense
of the notation, such as $A_{i+1}$, for i = 3).

Figure 18.1. A 3-web consisting of circles passing through pairs
of vertices of a triangle

A point P inside a triangle is uniquely characterized by the angles $\alpha = \angle A_1PA_2$,
$\beta = \angle A_2PA_3$ and $\gamma = \angle A_3PA_1$. Since an angle supported by a fixed chord of a circle has
a fixed measure, the three families of circles have the equations:

$\alpha = \mathrm{const}, \quad \beta = \mathrm{const}, \quad \gamma = \mathrm{const}.$

Since $\alpha + \beta + \gamma = 2\pi$, one may take the functions

$f = \alpha - 2\pi/3, \quad g = \beta - 2\pi/3, \quad h = \gamma - 2\pi/3$

as defining the 3-web. These functions satisfy (18.1), and hence this 3-web is trivial.
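
The relation between the three angles is easy to confirm numerically. The sketch below is only an illustration under our own choice of triangle and interior point; the helper names `angle_at` and `web_functions` are hypothetical and not part of the text.

```python
import math


def angle_at(p, a, b):
    """Angle APB (in radians) at the vertex p."""
    ux, uy = a[0] - p[0], a[1] - p[1]
    vx, vy = b[0] - p[0], b[1] - p[1]
    cos_angle = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
    return math.acos(cos_angle)


def web_functions(p, a1, a2, a3):
    """The three defining functions: each angle minus 2*pi/3."""
    shift = 2 * math.pi / 3
    return (
        angle_at(p, a1, a2) - shift,
        angle_at(p, a2, a3) - shift,
        angle_at(p, a3, a1) - shift,
    )


# A sample triangle and a point inside it (our choice).
f, g, h = web_functions((1.5, 1.0), (0.0, 0.0), (4.0, 0.0), (1.0, 3.0))
print(f + g + h)  # close to zero, up to rounding
```

The check relies on P being strictly inside the triangle, so that each of the three angles is less than $\pi$ and the arccosine returns the correct value.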
Next example: the 3-web in the interior of the first quadrant that consists of
horizontal lines, vertical lines and lines through the origin, see Figure 18.2. This
web has the equations

x = const, y = const, y/x = const.

Figure 18.2. A 3-web consisting of the horizontal lines, the vertical
lines and the lines through the origin

Another choice of defining functions is $\ln x$, $-\ln y$ and $\ln y - \ln x$. These three
functions satisfy (18.1), and hence this 3-web is trivial as well.

Our next example is a modification of the previous one, see Figure 18.3. This
3-web inside a triangle $A_1A_2A_3$ is made of the families of lines through the vertices
of the triangle. Is this web trivial?

Figure 18.3. A 3-web consisting of the lines through the vertices
of a triangle

Project the plane of the triangle $A_1A_2A_3$ on another plane, a screen, from a
point O so that the lines $OA_2$ and $OA_3$ are parallel to the screen. Then the lines
through point $A_2$ project to parallel lines, and likewise for $A_3$. Hence the projection
of the 3-web in Figure 18.3 is the one in Figure 18.2, and therefore trivial.
18.3 Hexagonal webs. The reader might have developed an illusion that all
3-webs are trivial. The truth is, a generic 3-web is not.

To see this, consider the configuration in Figure 18.4. Pick a point O and draw
the curves of the three families through it. Choose a point A on the first curve,
draw the curve from the second family through it until its intersection with the
third curve through O at point B, draw the curve from the first family through B
until its intersection with the second curve through O at point C, etc. The final
result of this spider-like activity is the point G on the first curve. For a trivial
3-web, G = A; in general, not necessarily so.
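
The construction can be followed step by step on a computer. The Python sketch below is an illustration under assumptions of our own: the first family is x = const, the second is y = const, the third consists of the level curves of a function h(x, y) that is increasing in each variable near the origin, O is the origin and A = (0, t). The functions `bisect`, `hexagon_gap`, `trivial` and `twisted` are hypothetical names introduced here.

```python
def bisect(fn, lo=-1.0, hi=1.0, iters=100):
    """Root of an increasing function fn on [lo, hi], found by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if fn(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def hexagon_gap(h, t):
    """Walk A -> B -> C -> D -> E -> F -> G and return y(G) - y(A)."""
    h0 = h(0.0, 0.0)          # the third curve through O is h = h0
    y = t                     # A = (0, t) lies on the first curve x = 0
    for _ in range(2):
        # Second-family curve (y fixed) until it meets the third curve through O.
        x = bisect(lambda s: h(s, y) - h0)
        # First-family curve (x fixed) until the second curve through O: point (x, 0).
        c = h(x, 0.0)
        # Third-family curve h = c until the first curve through O: point (0, y).
        y = bisect(lambda s: h(0.0, s) - c)
    return y - t


def trivial(x, y):
    return x + y              # the trivial web


def twisted(x, y):
    return x + y + x * x * y  # a sample third family, chosen by us


print(hexagon_gap(trivial, 0.3))  # essentially zero: the hexagon closes
print(hexagon_gap(twisted, 0.3))  # clearly non-zero: the hexagon does not close
```

The bisection brackets [-1, 1] are adequate for these two sample functions and t = 0.3; for other choices of h one would have to adjust them.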




Figure 18.4. A hexagon in a 3-web

A 3-web for which the hexagon in Figure 18.4 always closes up is called hexag- 
onal. A trivial 3-web is hexagonal. The converse is also true, see Exercise 18.5. 

18.4 Hexagonal rectilinear webs and cubic curves. A rectilinear web is 
a web whose curves are straight lines, see the examples in Figures 18.2 and 18.3. In 
this section we shall describe hexagonal rectilinear 3-webs. This subject is closely 
related, somewhat unexpectedly, with a classical result in geometry, the Pappus 
theorem. 

A generic 1-parameter family of lines consists of lines tangent to a curve; this
was discussed in detail in Lectures 8 and 9. We have three such families, and they
consist of lines tangent to three curves, say $\gamma_1$, $\gamma_2$ and $\gamma_3$.

Our web is assumed to be hexagonal, see Figure 18.5. Consider the dual configuration
of points and lines (see Lecture 8 for a discussion of projective duality).
This configuration is shown in Figure 18.5, right, where the points, dual to lines
AB, BC, etc., are marked AB, BC, etc., and lines, dual to points A, B, ... are
marked a, b, ... . Points AD, FE, BC lie on the dual curve $\gamma_1^*$, points AF, CD, BE
on the curve $\gamma_2^*$ and points FC, AB, DE on the curve $\gamma_3^*$. Let us call a configuration
of 6 lines and their 9 intersection points, as in Figure 18.5, a Pappus configuration.

We should like to know for which triples of curves $\gamma_1^*$, $\gamma_2^*$ and $\gamma_3^*$ one has a
Pappus configuration inscribed into these three curves. A sufficient (and necessary
- but we shall not prove it) condition is that $\gamma_1^*$, $\gamma_2^*$ and $\gamma_3^*$ are all parts of the same
cubic curve.
Figure 18.5. A closed hexagon and the projectively dual configuration

Cubic curves are given by equations of degree 3

$P(x, y) = ax^3 + bx^2y + \dots + j = 0$

(all in all, 10 terms). Multiplying all the coefficients by the same factor yields the
same curve. There is a unique cubic curve through a generic collection of 9 points.
Cubic curves have numerous interesting properties. The one that we need is as
follows.

Theorem 18.1. Consider two triples of lines intersecting at 9 points. If a
cubic curve $\Gamma$ passes through 8 out of these points then it also passes through the
9-th point (see Figure 18.6).

Proof. Denote the intersection point of line $L_i = 0$ with $R_j = 0$ by $A_{ij}$; here $L_i$
and $R_j$ are linear equations of the lines (i, j = 0, 1, 2). Assume that all points, except
possibly $A_{22}$, lie on $\Gamma$. Let $P(x, y) = 0$ be an equation of $\Gamma$ and set:
$\mathcal{L} = L_0L_1L_2$, $\mathcal{R} = R_0R_1R_2$.

We claim that $P = \lambda\mathcal{L} + \mu\mathcal{R}$ for some coefficients $\lambda$ and $\mu$. This would imply
that $P = 0$ at point $A_{22}$ since $L_2$ and $R_2$ both vanish at this point.

Choose coordinates so that $L_0(x, y) = x$ and $R_0(x, y) = y$ (this involves only
an affine change of coordinates, so a cubic curve remains cubic). Then
$\mathcal{L} = x(x - a_1 - b_1y)(x - a_2 - b_2y)$ where $a_1$ and $a_2$ are the x-coordinates of the points $A_{10}$ and
$A_{20}$. Since $P = 0$ at points $A_{00}$, $A_{10}$ and $A_{20}$, one has $P(x, 0) = \lambda x(x - a_1)(x - a_2)$
for some constant $\lambda$, that is, $P(x, 0) = \lambda\mathcal{L}(x, 0)$. Likewise, $P(0, y) = \mu\mathcal{R}(0, y)$.

Consider now the polynomial $Q = P - \lambda\mathcal{L} - \mu\mathcal{R}$. One has: $Q(x, 0) = Q(0, y) = 0$,
hence $Q(x, y) = xyH(x, y)$ where H is a linear function. Note that Q vanishes at
points $A_{11}$, $A_{12}$ and $A_{21}$ but xy does not vanish at these points. Therefore $H = 0$
at these three non-collinear points. This implies that $H = 0$ identically, and so is
Q. Therefore $P = \lambda\mathcal{L} + \mu\mathcal{R}$. □

Let us summarize: a rectilinear 3-web is hexagonal if it consists of three families
of tangent lines to a curve whose dual is a cubic.

For example, consider the semicubic parabola $y^2 = x^3$. The dual curve is a
cubic parabola (see Lecture 8), and hence the 3-web in Figure 18.7 is hexagonal.

Figure 18.6. Illustration of Theorem 18.1



18.5 Pappus and Pascal. Two particular cases of Theorem 18.1 are especially
worth mentioning. The first concerns the case when a cubic curve $\Gamma$
consists of three lines. Then we obtain the celebrated Pappus theorem depicted in
Figure 18.8.

The dual curve to the union of three lines is degenerate and consists of three
points. The respective 3-web is the one in Figure 18.3. Our earlier proof that this
web is hexagonal implies, via duality, the Pappus theorem.

Another particular case is when $\Gamma$ consists of a line and a conic. We obtain the
celebrated theorem of Pascal (1640) illustrated in Figure 18.9.

18.6 Addition of points on a cubic curve. Theorem 18.1 is intimately 
related to a remarkable operation of addition of points on a cubic curve. This 
operation is defined geometrically, based on the property that a line that intersects 
a cubic curve twice will intersect it once again. 

Here is the definition. Let L be a non-singular cubic curve. Choose a point E 
that will play the role of the zero element. Given two points, A and B, construct 
the third intersection point D of the line AB with L; connect D to E and let C be 
the third intersection point of this line with L. By definition, A + B = C. 

Let us work out an example: L is the graph $y = x^3$ and E is the origin. Let
points A, B and D have the first coordinates $x_1$, $x_2$ and $x_3$, see Figure 18.10. These
points are collinear, therefore

$\dfrac{x_1^3 - x_3^3}{x_1 - x_3} = \dfrac{x_2^3 - x_3^3}{x_2 - x_3},$

which implies $x_1^2 + x_1x_3 + x_3^2 = x_2^2 + x_2x_3 + x_3^2$ or $(x_1 - x_2)(x_1 + x_2 + x_3) = 0$, and
finally, $x_1 + x_2 + x_3 = 0$. Point C is centrally symmetric to D, so its first coordinate
is $-x_3 = x_1 + x_2$. Thus the addition of points on this cubic curve amounts to the
usual addition of the first coordinates.

Figure 18.7. A hexagonal 3-web consisting of the tangent lines
to a semicubic parabola

Figure 18.8. The Pappus theorem

Figure 18.9. The Pascal theorem

Figure 18.10. An example of addition of points on a cubic curve
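
The computation for the graph $y = x^3$ can be replayed in exact arithmetic. The following Python sketch is ours and is meant only as a check of this one example; points are represented by their first coordinates, and the function names `third_point` and `add` are hypothetical.

```python
from fractions import Fraction


def third_point(x1, x2):
    """First coordinate of the third intersection of y = x^3 with the line
    through the points with first coordinates x1 and x2 (tangent if x1 == x2)."""
    x1, x2 = Fraction(x1), Fraction(x2)
    m = x1 * x1 + x1 * x2 + x2 * x2   # slope of the line
    c = x1 ** 3 - m * x1              # the line is y = m*x + c
    x3 = -(x1 + x2)                   # the roots of x^3 - m*x - c sum to zero
    assert x3 ** 3 == m * x3 + c      # the third point really lies on the line
    return x3


def add(x1, x2):
    """Geometric addition on y = x^3 with the origin as the zero element."""
    d = third_point(x1, x2)           # third intersection point D of line AB
    return third_point(d, 0)          # connect D to E = origin, take the third point


print(add(2, 3))                                        # 5
print(add(add(1, 2), Fraction(1, 3)) == add(1, add(2, Fraction(1, 3))))  # True
```

Exact fractions are used so that the test "the point lies on the line" is an honest equality rather than a floating-point comparison.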

The addition of points on a cubic curve is commutative and associative. The
former is obvious: the line AB coincides with the line BA. The latter follows from
(and is equivalent to) Theorem 18.1, as illustrated in Figure 18.11. This theorem
implies that the intersection point of the lines connecting A to B + C, and A + B
to C, lies on the cubic curve (this is point X in the figure). It follows that

A + (B + C) = (A + B) + C = Y,

the desired associativity.

Figure 18.11. Associativity of addition

18.7 In space. We mentioned in Section 18.2 that every 2-web in the plane
is trivial. Not so in space! 1 A trivial 2-web in space is the one made of two families
of lines, parallel to the x and y axes, and every 2-web that can be deformed to this
one.

1 Every 1-web in space is trivial.

To construct an obstruction to triviality of a 2-web in space is even easier than
in the plane. Take a point A, draw through it the curve from the first family, choose
a point B on this curve, draw through it the curve from the second family, choose
a point C on this curve. Now draw through C the curve from the first family and
through A the curve from the second family, see Figure 18.12. Will these curves
intersect? Yes, if the web is trivial, and no, in general.

Figure 18.12. This quadrilateral may fail to close up

Here is an example of a non-trivial 2-web. The first family consists of the
vertical lines and the second of horizontal ones. The horizontal lines in the plane at
height h are parallel to each other and have slope h in this plane. In other words,
take a horizontal plane with a family of parallel lines on it and move it along the
vertical axis, revolving about this axis with a positive angular speed. This screw
driver motion yields the web depicted in Figure 18.13.
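
For this particular web the failure of the quadrilateral to close can be computed directly. The Python sketch below is an illustration with conventions of our own: space has coordinates (x, y, z) with z the height, and the horizontal line through a point at height z has direction (1, z, 0). The function name `quadrilateral_gap` is hypothetical.

```python
def quadrilateral_gap(a, z1, s):
    """Start at A, climb the vertical line to height z1 (point B), move along
    the horizontal line through B by s in the x-direction (point C), and
    compare the vertical line through C with the horizontal line through A.
    Returns the difference of their y-coordinates at height z0 and x = x0 + s;
    the two lines intersect exactly when this difference is zero."""
    x0, y0, z0 = a
    # C lies on the horizontal line through B = (x0, y0, z1), of slope z1.
    c = (x0 + s, y0 + s * z1, z1)
    # The vertical line through C meets the plane z = z0 at this point.
    below_c = (c[0], c[1], z0)
    # The horizontal line through A has slope z0; its point with x = x0 + s.
    on_line_through_a = (x0 + s, y0 + s * z0, z0)
    return below_c[1] - on_line_through_a[1]


print(quadrilateral_gap((0.0, 0.0, 0.0), 1.0, 2.0))  # 2.0: the curves miss each other
print(quadrilateral_gap((0.0, 0.0, 1.0), 1.0, 2.0))  # 0.0: B = A, a degenerate case
```

With these conventions the gap is s(z1 - z0), which vanishes only in the degenerate cases s = 0 or z1 = z0.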

Non-triviality of this 2-web has an interesting consequence. The two lines
through point x span a plane, say, $\pi(x)$. We obtain a family of planes in space.
These planes enjoy the property called complete non-integrability: there does not
exist a surface (no matter how small) for which $\pi(x)$ would be the tangent plane
at every point x. Indeed, if such a surface existed then the quadrilateral in Figure
18.12 would lie on it and be closed.

Figure 18.13. A non-trivial 2-web in space

At first glance, this non-integrability is rather surprising: it contradicts our
intuition developed in the plane case. In the plane (and in space of any dimension),
if one has a direction at every point, smoothly depending on the point, then there
exists a family of smooth curves, everywhere tangent to these directions - see Figure
18.14 (this is the Fundamental Theorem of Ordinary Differential Equations). A
completely non-integrable field of planes in space is called a contact structure; this
is a very popular object of study in contemporary mathematics.




Figure 18.14. A family of directions integrates to a family of curves 



18.8 Chebyshev nets. Can one model fabric by a web? 

A flat piece of fabric is woven of two families of non-stretchable threads making a 
rectangular grid. Drape this piece of fabric over a curved surface, and the rectangles 
will get distorted to elementary parallelograms, see Figure 18.15. 

A Chebyshev net is a 2-web such that the lengths of the opposite sides of every
quadrilateral, made by a pair of curves from each family, are equal, see Figure 18.16.

Pafnuty Chebyshev, a prominent Russian mathematician of the 19th century, 2
was motivated by an applied problem: how to cut fabric more economically (he
was working for a private client, an owner of a textile business). This was an acute
problem: with the onset of the Crimean War, there was a huge demand for army
uniforms.

2 The reader of Lecture 7 is familiar with some other works of Chebyshev.

Figure 18.15. A piece of fabric

Figure 18.16. Chebyshev net (AD = BC, AB = DC)

Here is a construction of a Chebyshev net in the plane. Start with two curves, a
and b, intersecting at the origin O. For every point A on curve a, parallel translate
curve b through vector OA. Likewise, for every point B on b, translate a through
vector OB. The result is a 2-web, and this is a Chebyshev net. Indeed, the
quadrilateral OBCA in Figure 18.17 is a parallelogram. Therefore the curve BC
is a parallel translate of the curve OA, and likewise for the curves AC and OB.
Hence the opposite sides of the curvilinear quadrilateral OBCA are equal.

One can say more: the curves a and b uniquely determine a Chebyshev net.
To see this, let us approximate the curves a and b by polygonal lines, say, with
sides of length $\varepsilon$. If an elementary quadrilateral of a Chebyshev net has rectilinear
sides then it is a parallelogram. It follows that the polygonal lines a and b uniquely
determine a family of parallelograms, see Figure 18.18. In the limit $\varepsilon \to 0$, one
obtains a Chebyshev net generated by the curves a and b.

To obtain a Chebyshev net on a curved surface, one can draw a planar Chebyshev
net on a sheet of paper and then bend the sheet into a "developable" surface
(see Lecture 13); for example, one can consider a surface made of a sheet of graph
paper. However, Chebyshev nets exist not only on developable surfaces.

Figure 18.17. A construction of a Chebyshev net in the plane

Figure 18.18. A Chebyshev net made of parallelograms

Here is a more general construction of a Chebyshev net, this time, on a curved 
surface. Let a and b be two curves in space. For every pair of points, A on a and B 
on b, let C be the midpoint of the segment AB. The locus of points C is a surface. 
This surface is made of two families of curves: these curves are obtained by fixing, 
in the above construction, point A (first family) or point B (second family). This 
is a Chebyshev net. 

Indeed, consider two pairs of points: A and A' on curve a, and B and B' on
curve b, see Figure 18.19. The midpoints of the four segments, K, L, M, N, lie on
the surface. The pairs K, L and M, N lie on two curves from one family, and the
pairs K, N and L, M on two curves from another family. One has:

$\overrightarrow{KL} = \tfrac{1}{2}\overrightarrow{AA'} = \overrightarrow{NM}, \qquad \overrightarrow{KN} = \tfrac{1}{2}\overrightarrow{BB'} = \overrightarrow{LM},$

and hence curve NM is obtained from curve KL by parallel translation through
vector KN; and likewise for curves LM and KN. It follows that the lengths of the
opposite sides of the curvilinear quadrilateral KLMN are equal.

The surface itself (the locus of midpoints C) is the result of parallel translation 
of curve a along curve b. Such surfaces are called translation surfaces. If curves 
a and b lie in one plane, the translation surface coincides with this plane, and the 
constructed Chebyshev net is the one in Figure 18.17. 
Figure 18.19. A construction of a Chebyshev net in space

A familiar example of a translation surface is a hyperbolic paraboloid $z = x^2 - y^2$;
the respective curves are the parabolas $(2x, 0, 2x^2)$ and $(0, 2y, -2y^2)$ (see
Figure 18.20, left); a circular paraboloid $z = x^2 + y^2$ provides one more example
(see Figure 18.20, right).
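
The midpoint construction for the hyperbolic paraboloid is easy to check numerically. The Python sketch below is a small illustration; the names `curve_a`, `curve_b`, `net_point` and `difference` are ours, and the sample parameter values are arbitrary.

```python
def curve_a(x):
    """The parabola (2x, 0, 2x^2)."""
    return (2 * x, 0.0, 2 * x * x)


def curve_b(y):
    """The parabola (0, 2y, -2y^2)."""
    return (0.0, 2 * y, -2 * y * y)


def net_point(x, y):
    """Midpoint of the segment joining curve_a(x) and curve_b(y)."""
    return tuple((u + v) / 2 for u, v in zip(curve_a(x), curve_b(y)))


def difference(p, q):
    """The vector from q to p."""
    return tuple(u - v for u, v in zip(p, q))


# The midpoint lies on the surface z = x^2 - y^2.
px, py, pz = net_point(0.5, 0.25)
print(pz, px * px - py * py)  # 0.1875 0.1875

# Midpoints for two points on each curve form a parallelogram K, L, M, N.
K, L = net_point(0.5, 0.25), net_point(1.0, 0.25)
M, N = net_point(1.0, 0.75), net_point(0.5, 0.75)
print(difference(L, K) == difference(M, N))  # True
print(difference(N, K) == difference(M, L))  # True
```

The sample values are chosen to be exactly representable in binary floating point, so the equalities hold exactly here; for general values one would compare with a tolerance.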




Figure 18.20. Quadratic surfaces as translation surfaces 
Returning to the difference between developable surfaces and surfaces with
Chebyshev nets, one can notice that while the former are images of maps of the 
plane to space preserving lengths of all smooth curves, the latter are described by 
maps which preserve only the lengths of vertical and horizontal lines. Less formally, 
one can say that a surface is developable if one can tightly attach to it a piece of 
paper, while it admits a Chebyshev net, if one can tightly attach to it a fishing 
net. For example, the paraboloids of Figure 18.20 are not developable, but admit 
Chebyshev nets. 



18.9 Exercises.

18.1. (a) Prove that the web made of horizontal lines, vertical lines and hyperbolas
xy = const is trivial.

(b) Same for the web made of horizontal lines, vertical lines and the graphs
y = f(x) + const where f(x) is any function with positive derivative.

18.2. Consider the 3-web made of the lines through one fixed point, the lines
through another fixed point and of tangent half-lines to a fixed circle. Is this web
trivial?

18.3. Prove that the 3-web made of the tangent lines to a fixed circle and the 
lines through a fixed point is hexagonal. 

18.4. Consider a triangle made of curves of a 3-web. Show that there exists a 
unique inscribed triangle, made of curves of this web. 



18.5. * Prove that a hexagonal 3-web is trivial. 

Hint. Extend the hexagon to a "honeycomb" as in Figure 18.21. One can 
change coordinates so that the honeycomb is made of three families of parallel 
lines. Making the original hexagon smaller and smaller, one deforms, in the limit, 
a hexagonal 3-web into a trivial one. 
18.6. * Prove that the three inflection points of a smooth cubic curve lie on a
straight line (this is clearly seen in Figure 18.6). 




LECTURE 19 
The Crofton Formula 

19.1 The space of rays and the area form. The Crofton formula concerns 
the set of oriented lines in the plane. Sometimes, as in geometrical optics, we shall 
think of oriented lines as rays of light and call them rays. 

An oriented line is characterized by its direction $\varphi$ and its distance p from the
origin O. This distance is signed, see Figure 19.1. Thus the space of rays is a
cylinder with coordinates $(\varphi, p)$ (which we visualize as the vertical unit circular
cylinder in space).

Figure 19.1. Coordinates in the space of oriented lines

Changing the orientation of a line is the central symmetry of the cylinder:
$(\varphi, p) \mapsto (\varphi + \pi, -p)$. It follows that the set of non-oriented lines identifies with the
quotient of the cylinder by this central symmetry, that is, the Möbius band.

Translating the origin changes the coordinates of a line. Namely, if O' =
O + (a, b) is a different choice of the origin then the new coordinates depend on the
old ones as follows:

(19.1) $\varphi' = \varphi, \quad p' = p - a\sin\varphi + b\cos\varphi.$

The reader may look up a proof of this formula in Lecture 10 or do Exercise 19.1.
In particular, the set of rays through point (a, b) is given by the equation
$p = a\sin\varphi - b\cos\varphi$ (set $p' = 0$ in (19.1)). This is a plane section of our cylinder.
The cylinder has the area element $d\varphi\,dp$; this area form is the main character of
this lecture. It does not change under isometries of the plane. Indeed, an isometry
is a composition of a rotation about the origin and a parallel translation. A rotation
through angle $\alpha$ acts on the space of rays as follows:

$\varphi' = \varphi + \alpha, \quad p' = p.$

This is a rotation of the cylinder that does not distort the area. A parallel translation
acts according to formulas (19.1). This does not change the area either, see
Figure 19.2.

Figure 19.2. Parallel translation preserves the area form in the
space of oriented lines

It is worth mentioning a fact known to Archimedes. Inscribe a unit sphere 
into the cylinder and consider the axial projection from the sphere to the cylinder 
(which is not defined at the poles). This projection is area-preserving: the areas of 
any domain on the sphere and the cylinder are equal, see Exercise 19.2. This fact 
makes it possible to immediately find the area of the sphere. 

The axial projections are used in cartography where they are known under
a number of names: Gall-Peters, Behrmann, Lambert, Balthasart, etc. They do
not distort areas, so Greenland appears approximately 13 times as small as Africa
(unlike the Mercator projection where they appear roughly of the same areas) but
they severely distort distances, especially, near the poles. See [90] for the history
of cartography.

The fact that the axial projection from the unit sphere to the cylinder is area-preserving
implies that the area of a spherical belt (the domain of the sphere between
two parallel planes) depends only on the height h of the belt (that is, h is
the distance between the planes); namely, this area equals $2\pi h$. This high school
geometry fact has a curious consequence.

Let the unit disc be covered by a collection of strips with parallel sides ("planks").

Theorem 19.1. The sum of the widths of the strips is not less than 2. 

This is the Tarski plank theorem (its statement is of course obvious if the strips 
are parallel). 
Proof. Consider the disc covered by strips as the vertical projection of the unit
sphere covered by spherical belts. The total area of the belts is $2\pi$ times the sum
of their widths, and this is not less than the area of the sphere, $4\pi$. Thus the sum
of the widths is not less than 2. □
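
The key ingredient of the proof, that a belt of height h on the unit sphere has area $2\pi h$ wherever it is located, can be checked by a direct numerical integration. The Python sketch below is ours; it integrates the standard area element $\sin\theta\,d\theta\,d\psi$ of the unit sphere (with $\theta$ the polar angle) by the midpoint rule, and the function name `belt_area` is hypothetical.

```python
import math


def belt_area(z1, z2, n=20000):
    """Area of the part of the unit sphere between the planes z = z1 and
    z = z2 (with -1 <= z1 < z2 <= 1), by the midpoint rule in the polar angle."""
    t1, t2 = math.acos(z2), math.acos(z1)   # polar angles, t1 < t2
    dt = (t2 - t1) / n
    total = sum(math.sin(t1 + (k + 0.5) * dt) for k in range(n))
    return 2 * math.pi * total * dt


# Two belts of the same height 0.5, one near the equator and one at the pole.
print(belt_area(-0.2, 0.3))  # approximately pi
print(belt_area(0.5, 1.0))   # approximately pi as well
```

Both values agree with $2\pi \cdot 0.5 = \pi$ up to the discretization error, although the two belts look quite different on the sphere.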

19.2 Relation to geometrical optics. The area form $d\varphi\,dp$ on the space
of rays plays a role in geometrical optics. An ideal mirror, in dimension 2, is
represented by a plane curve; the law of reflection is "the angle of incidence equals
the angle of reflection".

Thus a mirror determines a (partially defined) transformation of the space of
rays: an incoming ray is sent to the reflected, outgoing one. This is a transformation
of the cylinder, and its crucial property is that it is area-preserving. The same holds
for more complicated optical systems involving a number of mirrors and lenses. See
Lecture 28 for a detailed discussion of this area preserving property in the framework
of billiards.

By the way, the existence of an area form on the space of rays, invariant under 
mirror reflection, is not specific to the plane. Consider, for example, the unit sphere. 
The role of lines is played by great circles. An oriented great circle is uniquely 
characterized by its pole, a center of this great circle in the spherical metric (it is a 
matter of convention which of the two poles to choose, but once made, this choice 
should be consistent; in particular, changing the orientation of a great circle, one 
chooses the opposite pole). 

Thus the space of rays on the sphere is identified with the sphere itself (this 
construction almost coincides with the projective duality discussed in Lecture 8). 
The sphere has a standard area element, and this provides an area element on the
space of rays. This area form is invariant under the motions of the sphere (Exercise
19.3) and does not change under a reflection in any mirror, represented by a smooth 
spherical curve. 

19.3 The formula. Consider a smooth plane curve $\gamma$ (not necessarily closed
or simple), and define a function $n_\gamma$ on the space of oriented lines as the number of
intersections of a line with the curve. There are some problems with this definition:
for example, if $\gamma$ is a straight segment then $n_\gamma$ will have an infinite value for the two
oriented lines that contain this segment. Not to worry: we are going to integrate
$n_\gamma$ over the cylinder, and these "abnormalities" will not contribute to the integral.

Generically, the value of $n_\gamma$ changes by 2 when the line becomes tangent to
the curve $\gamma$, see Figure 19.3. If $(\varphi, p)$ are the coordinates of the line we write the
function as $n_\gamma(\varphi, p)$.

Crofton's formula is the following statement. 

Theorem 19.2.

(19.2) $\mathrm{length}(\gamma) = \dfrac{1}{4} \iint n_\gamma(\varphi, p)\, d\varphi\, dp.$

If the curve $\gamma$ is bounded, that is, lies in a disc of some radius r, centered at
the origin, then $n_\gamma(\varphi, p) = 0$ for $|p| > r$, and hence the integral has a finite value.

Proof of Crofton's formula. The curve $\gamma$ can be approximated by a polygonal
line, and it suffices to prove (19.2) for such a line. Suppose that a polygonal line
is the concatenation of two, $\gamma_1$ and $\gamma_2$. Both sides of (19.2) are additive, and the
formula for $\gamma$ would follow from those for $\gamma_1$ and $\gamma_2$. Hence it suffices to establish
(19.2) for a segment.

Figure 19.3. The function $n_\gamma$

Let $C$ be the value of the integral $\iint n_\gamma\, d\varphi\, dp$ in (19.2) for a unit segment; the constant does 
not depend on the position of the segment because the area form on the space of 
lines is isometry invariant. A dilation by a factor $r$ multiplies the area form by $r$, 
therefore 

$$\iint n_\gamma(\varphi, p)\, d\varphi\, dp = C \cdot \mathrm{length}(\gamma)$$

for every segment $\gamma$. 

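It remains to check that $C = 4$. This is most easily seen when $\gamma$ is the unit 
circle centered at the origin: the length is $2\pi$, while $n_\gamma(\varphi, p) = 2$ for all $\varphi$ and 
$-1 < p < 1$, and zero otherwise, so that the integral equals $2 \cdot 2\pi \cdot 2 = 8\pi = 4 \cdot 2\pi$. □ 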
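
For the reader who likes to experiment, here is a minimal numerical sketch of Crofton's formula for polygonal lines, written in Python. It relies on one observation: for a fixed direction $\varphi$, the integral of the number of intersections over $p$ equals the total length of the projections of the segments onto the normal to that direction. The name `crofton_length` and the choice of the normal `(-sin phi, cos phi)` are our own conventions, and the integral over $\varphi$ is only approximated by the midpoint rule.

```python
import math

def crofton_length(points, n_phi=2000):
    """Estimate the length of a polygonal line via Crofton's formula (19.2).

    For each direction phi the integral of the crossing count over p is
    computed exactly as the sum of the projections of the segments onto
    the normal; the integral over phi uses the midpoint rule.
    """
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for k in range(n_phi):
        phi = (k + 0.5) * d_phi
        nx, ny = -math.sin(phi), math.cos(phi)  # unit normal to direction phi
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            total += abs((x1 - x0) * nx + (y1 - y0) * ny) * d_phi
    return total / 4


if __name__ == "__main__":
    unit_segment = [(0.0, 0.0), (1.0, 0.0)]
    print(crofton_length(unit_segment))  # should be close to 1
```

Applied to a polygon inscribed in the unit circle, the same function should return a value close to the perimeter of that polygon, in agreement with $C = 4$.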

An analog of Crofton's formula holds for curves on the sphere: of course, the 
integral is taken with respect to the area element, discussed at the end of Section 
19.2, see Exercise 19.5. 

19.4 First applications. The Crofton formula has numerous applications. 
In this section, we shall discuss four. 

1). Consider two nested closed convex curves, $\gamma$ and $\Gamma$, see Figure 19.4, and let 
$\ell$ and $L$ be their lengths. We claim that $L \ge \ell$. Indeed, a line intersects a convex 
curve at two points, and every line that intersects the inner curve intersects the 
outer one as well. Hence $n_\Gamma \ge n_\gamma$, and the result follows from the Crofton formula. 





Figure 19.4. Nested ovals 

2). The width of a convex figure in a given direction is the distance between 
two lines in this direction, tangent to the figure on the opposite sides. A figure of 
constant width has the same width in all directions. An example is a circle. 

There are many other figures of constant width, see Figure 19.5. What they 
have in common is their perimeter length, equal to $\pi d$, where $d$ is the width. Let 
us prove this claim. 



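Let $\gamma$ be a closed convex curve of constant width $d$. Choose an origin inside $\gamma$. 
Consider the tangent line to $\gamma$ in the direction $\varphi$ and let $p(\varphi)$ be its distance from 
the origin. The periodic function $p(\varphi)$ is called the support function of the curve, 
see Lecture 10. 

The constant width condition is: $p(\varphi) + p(\varphi + \pi) = d$. For every direction $\varphi$, the 
lines that intersect $\gamma$ fill a strip of width $p(\varphi) + p(\varphi + \pi)$, and each of them meets 
$\gamma$ twice. By the Crofton formula, 

$$\mathrm{length}(\gamma) = \frac{1}{4} \iint n_\gamma(\varphi, p)\, d\varphi\, dp = \frac{1}{4} \int_0^{2\pi} 2 \bigl( p(\varphi) + p(\varphi + \pi) \bigr)\, d\varphi = \frac{1}{4} \cdot 2d \cdot 2\pi = \pi d,$$

as claimed. 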
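
As an illustration, one can test the claim numerically on the Reuleaux triangle of Exercise 19.7 (a). The following Python sketch builds the boundary from three circular arcs of radius $d$ centered at the vertices of an equilateral triangle, and then measures the perimeter and the width in several directions. The function names and the sampling density are our own choices, and the curve is replaced by a fine inscribed polygon, so the values are approximate.

```python
import math

def reuleaux_boundary(d=1.0, n_per_arc=2000):
    """Sample points on the boundary of a Reuleaux triangle of width d."""
    # Vertices of an equilateral triangle with side d, counterclockwise.
    verts = [(0.0, 0.0), (d, 0.0), (d / 2, d * math.sqrt(3) / 2)]
    pts = []
    for i in range(3):
        cx, cy = verts[i]             # center of the arc
        ax, ay = verts[(i + 1) % 3]   # the arc starts at the next vertex
        start = math.atan2(ay - cy, ax - cx)
        for k in range(n_per_arc):
            t = start + (math.pi / 3) * k / n_per_arc
            pts.append((cx + d * math.cos(t), cy + d * math.sin(t)))
    return pts

def perimeter(pts):
    """Perimeter of the closed polygon with the given vertices."""
    return sum(math.hypot(pts[i][0] - pts[i - 1][0], pts[i][1] - pts[i - 1][1])
               for i in range(len(pts)))

def width(pts, theta):
    """Extent of the point set along the unit vector (cos theta, sin theta)."""
    proj = [x * math.cos(theta) + y * math.sin(theta) for x, y in pts]
    return max(proj) - min(proj)


if __name__ == "__main__":
    pts = reuleaux_boundary(1.0)
    print(perimeter(pts), math.pi)                   # both close to pi * d
    print([round(width(pts, 0.3 * k), 6) for k in range(5)])
```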

3). The distance between the lines on a ruled paper is 1. What is the probability 
that a unit length needle, randomly dropped on the paper, intersects a line? This 
is the celebrated Buffon's needle problem. 

Assume that the unit segment, $\gamma$, is horizontal and centered at the origin, while 
the ruled paper may assume all possible positions. Now, instead of the ruled paper, 
consider just one line at distance at most 1/2 from the origin. Then the set of all possible 
positions of the line is the rectangle $0 \le \varphi < 2\pi$, $-1/2 \le p \le 1/2$, whose area is $2\pi$. 

If a line intersects $\gamma$ then $n_\gamma = 1$, otherwise $n_\gamma = 0$. Therefore the desired 
probability equals 

$$\frac{1}{2\pi} \iint n_\gamma(\varphi, p)\, d\varphi\, dp.$$

By Crofton's formula, the integral is 4 times the length of $\gamma$, and the probability 
equals $2/\pi$. 
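
The answer $2/\pi \approx 0.6366$ can be compared with a simulation. The Python sketch below is a plain Monte Carlo experiment under our own modelling assumptions: the lines are $y = 0, \pm 1, \pm 2, \ldots$, the center of the needle has a uniformly distributed height modulo 1, and its direction is uniform in $[0, \pi)$. The function name, the number of drops and the seed are arbitrary.

```python
import math
import random

def buffon_estimate(n_drops=200_000, seed=0):
    """Fraction of unit needles that cross one of the lines y = integer."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_drops):
        y = rng.random()                   # height of the needle's center, mod 1
        theta = rng.uniform(0.0, math.pi)  # direction of the needle
        half = 0.5 * math.sin(theta)       # half of the vertical extent
        if y - half < 0.0 or y + half > 1.0:
            hits += 1
    return hits / n_drops


if __name__ == "__main__":
    print(buffon_estimate(), 2 / math.pi)
```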

4). The curvature of a smooth space curve is the magnitude of the acceleration 
vector if the curve is given an arc length parameterization, see Lecture 15. 

Theorem 19.3. The total curvature of a closed space curve is at least $2\pi$. 




Figure 19.5. A figure of constant width 
Proof. Let $\gamma(t)$ be the curve, parameterized by arc length. As the parameter $t$ varies, the velocity vector 
$\Gamma(t) = \gamma'(t)$ describes a closed curve on the unit sphere (sometimes called the 
tangent indicatrix, or simply the tantrix). The length of the tantrix is 

$$\int |\Gamma'(t)|\, dt = \int |\gamma''(t)|\, dt,$$

that is, the total curvature of $\gamma$. We want to show that the length of $\Gamma$ is not less 
than $2\pi$. 

We claim that the tantrix intersects every great circle at least twice. All great 
circles being equal, take the equator. Consider the highest and the lowest points of 
$\gamma$. At these points, the velocity $\gamma'$ is horizontal, and hence $\Gamma$ intersects the equator. 

Finally, apply the spherical Crofton formula: $n_\Gamma \ge 2$ everywhere and the total 
area of the sphere is $4\pi$, therefore $\mathrm{length}(\Gamma) \ge 2\pi$. □ 



Figure 19.6. A curve with one local maximum and one local 
minimum cannot be knotted 

The celebrated Fary-Milnor theorem says more: 

Theorem 19.4. If a closed space curve is knotted then its total curvature is 
greater than $4\pi$. 

That the total curvature is not less than $4\pi$ follows from the more-or-less obvious 
fact that a knot must have at least two local maxima and two local minima, 
see Figure 19.6. We do not dwell on how to make this inequality strict. 
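
Both theorems can be illustrated on polygonal approximations, for which the total curvature is the sum of the turning angles at the vertices. The Python sketch below is only an experiment: the function name is ours, and the trefoil parameterization $(\sin t + 2 \sin 2t,\ \cos t - 2 \cos 2t,\ -\sin 3t)$ is a commonly used one that we assume here, not a curve taken from the text.

```python
import math

def total_curvature(points):
    """Sum of the turning angles of a closed polygonal line in space."""
    n = len(points)
    total = 0.0
    for i in range(n):
        a, b, c = points[i - 1], points[i], points[(i + 1) % n]
        u = [b[j] - a[j] for j in range(3)]
        w = [c[j] - b[j] for j in range(3)]
        cross = [u[1] * w[2] - u[2] * w[1],
                 u[2] * w[0] - u[0] * w[2],
                 u[0] * w[1] - u[1] * w[0]]
        dot = sum(u[j] * w[j] for j in range(3))
        # Angle between consecutive edge vectors, in [0, pi].
        total += math.atan2(math.sqrt(sum(x * x for x in cross)), dot)
    return total

def sample(curve, n=2000):
    """Vertices curve(t) for n equally spaced parameter values in [0, 2*pi)."""
    return [curve(2 * math.pi * k / n) for k in range(n)]

def circle(t):
    return (math.cos(t), math.sin(t), 0.0)

def trefoil(t):
    # Assumed parameterization of a trefoil knot (not from the text).
    return (math.sin(t) + 2 * math.sin(2 * t),
            math.cos(t) - 2 * math.cos(2 * t),
            -math.sin(3 * t))


if __name__ == "__main__":
    print(total_curvature(sample(circle)) / math.pi)   # close to 2
    print(total_curvature(sample(trefoil)) / math.pi)  # expected to exceed 4
```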
19.5 The DNA geometric inequality. Consider again two plane closed 
smooth nested curves: the outer one, $\Gamma$, is convex and the inner one, $\gamma$, is not 
necessarily convex and may have self-intersections. The picture resembles DNA 
inside a cell, see Figure 19.7. 




Figure 19.7. DNA inside a cell 



Define the total absolute curvature of a closed curve as the integral of the 
absolute value of the curvature with respect to the arc length parameter. Total 
absolute curvature is the "total turn" of the curve. The average absolute curvature 
of a curve is the total absolute curvature divided by the length. 

One has the following geometrical inequality. 

Theorem 19.5. The average absolute curvature of $\Gamma$ is not greater than the 
average absolute curvature of $\gamma$. 

We call this the DNA geometric inequality. This theorem was proved only 
recently [49, 54], and the proof is surprisingly hard. We shall prove a weaker 
result, due to Fary (of the Fary-Milnor theorem fame; this theorem was mentioned 
in Section 19.4). Namely, we assume that the outer curve, $\Gamma$, has a constant width 
(for example, a circle). 

Proof. We already know that the length of $\Gamma$ is $\pi d$, where $d$ is its (constant) width, 
and its total curvature is $2\pi$. Denote the total absolute curvature of $\gamma$ by $C$, and let $L$ be 
its length. We want to prove that 

$$\frac{C}{L} \ge \frac{2}{d}. \tag{19.3}$$

Give $\gamma$ an orientation and define a locally constant function $q(\varphi)$ on the circle as 
the number of oriented tangent lines to $\gamma$ having direction $\varphi$. One has the following 
integral formula for the total absolute curvature: 

$$C = \int_0^{2\pi} q(\varphi)\, d\varphi. \tag{19.4}$$



Indeed, if $t$ is the arc length parameter on $\gamma$ and $\varphi$ the direction of its tangent line 
then the curvature is $k = d\varphi/dt$. The total absolute curvature 

$$\int_0^L |k|\, dt = \int_0^L \left| \frac{d\varphi}{dt} \right| dt$$

is the total variation of $\varphi$. Let $I$ be an interval of $\varphi$ for which the function $q(\varphi)$ has 
a constant value, say, $m$. Then $I$ has $m$ pre-images under the function $\varphi(t)$, and 
the value of the integral 

$$\int \left| \frac{d\varphi}{dt} \right| dt$$

over each of these $m$ intervals is the length of $I$. This implies formula (19.4). 
Let us use Crofton's formula to evaluate $L$. The crucial observation is that 

$$n_\gamma(\varphi, p) \le q(\varphi) + q(\varphi + \pi) \tag{19.5}$$

for all $p$, $\varphi$. Indeed, between two consecutive intersections of $\gamma$ with a line whose 
coordinates are $(\varphi, p)$, the tangent line to $\gamma$ at least once has the direction $\varphi$ or 
$\varphi + \pi$ (this is, essentially, Rolle's theorem), see Figure 19.8. 

Figure 19.8. A version of Rolle's theorem 

Denote the support function of $\Gamma$ by $p(\varphi)$. Every line that intersects $\gamma$ also 
intersects $\Gamma$, so for a given direction $\varphi$ such lines lie in a strip of width 
$p(\varphi) + p(\varphi + \pi)$. It remains to integrate (19.5), taking 
into account (19.4) and that $p(\varphi) + p(\varphi + \pi) = d$: 

$$L = \frac{1}{4} \iint n_\gamma(\varphi, p)\, d\varphi\, dp \le \frac{1}{4} \int_0^{2\pi} \bigl( q(\varphi) + q(\varphi + \pi) \bigr) \bigl( p(\varphi) + p(\varphi + \pi) \bigr)\, d\varphi = \frac{d}{4} \int_0^{2\pi} \bigl( q(\varphi) + q(\varphi + \pi) \bigr)\, d\varphi = \frac{Cd}{2}.$$

This implies the desired inequality (19.3). □ 

We can add that equality in (19.3) implies that $\gamma$ coincides with $\Gamma$, possibly 
traversed more than once. This follows from our proof as well. 

It is interesting to investigate the spatial version of the DNA inequality when 
the cell is a convex body containing a closed curve. If the cell is a ball one has 
essentially the same result, but nothing is known for cells of more general shapes. 

19.6 Hilbert's Fourth problem. In his famous talk at the International 
Congress of Mathematicians in 1900, D. Hilbert formulated 23 problems that would 
greatly influence the development of mathematics in the 20th century and are likely 
to continue to inspire mathematicians. The Fourth problem asks to "construct and 
study the geometries in which the straight line segment is the shortest connection 
between two points". In this section, we shall show how Crofton's formula yields a 
solution to Hilbert's Fourth problem in dimension two. 

First of all, what does one mean by "geometry"? Geometrical optics suggests 
the following answer. Consider propagation of light in an inhomogeneous and 
anisotropic medium. This means that the speed of light depends on the point and 
the direction. One may define "distance" between points $A$ and $B$ as the shortest 
time it takes light to get from $A$ to $B$. This defines a geometry, called Finsler 
geometry, and the trajectories of light will be the analogs of straight lines (they are 
called geodesics). We want these geodesics to be straight lines. 

The speed of light at every point x can be described by the "unit circle" at x 
consisting of unit velocity vectors at this point. This "unit circle", called the indi- 
catrix, is a smooth convex centrally symmetric curve, centered at x. For example, 
in the standard Euclidean plane, all indicatrices are unit circles. If all indicatrices 
are ellipses, the geometry is called Riemannian (this is the most important and 
thoroughly studied class of geometries). 

Let us start with examples satisfying Hilbert's requirement. The very first one, 
of course, is the Euclidean metric in the plane. Next, consider the unit sphere 
with its standard geometry in which the geodesics are great circles. Project the 
sphere on some plane from the center; this central projection identifies the plane 
with a hemisphere, and it takes great circles to straight lines. This gives the plane 
a geometry, different from Euclidean, whose geodesics are straight lines (for the 
reader familiar with differential geometry, this is a Riemannian metric of constant 
positive curvature). 

The next example features hyperbolic geometry whose discovery was one of the 
major achievements of 19th century mathematics. Consider the unit disc in the 
plane and define the distance between points $x$ and $y$ by the formula: 

$$d(x, y) = \ln [a, x, y, b] \tag{19.6}$$

where $a$ and $b$ are the intersection points of the line $xy$ with the boundary circle, 
see Figure 19.9, and $[a, x, y, b]$ is the cross-ratio of four points defined as 

$$[a, x, y, b] = \frac{(a - y)(x - b)}{(a - x)(y - b)}.$$

This is the so-called Beltrami-Klein (or projective) model of the hyperbolic plane. 
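
To see formula (19.6) at work, here is a small Python sketch that evaluates the distance and checks, on sample points, that it is additive along a chord and satisfies the strict triangle inequality off the chord. The function name and the use of an affine parameter $t$ on the line $xy$ (with $x$ at $t = 0$ and $y$ at $t = 1$) are our own devices; the cross-ratio of four collinear points is computed from their parameters.

```python
import math

def klein_distance(x, y):
    """Distance (19.6) between two distinct points of the open unit disc."""
    ux, uy = y[0] - x[0], y[1] - x[1]
    # Solve |x + t*u|^2 = 1: the roots t_a < 0 and t_b > 1 give a and b.
    A = ux * ux + uy * uy
    B = 2 * (x[0] * ux + x[1] * uy)
    C = x[0] ** 2 + x[1] ** 2 - 1
    root = math.sqrt(B * B - 4 * A * C)
    t_a, t_b = (-B - root) / (2 * A), (-B + root) / (2 * A)
    # Cross-ratio [a, x, y, b] in terms of the parameters t_a, 0, 1, t_b.
    cross_ratio = ((t_a - 1) * (0 - t_b)) / ((t_a - 0) * (1 - t_b))
    return math.log(cross_ratio)


if __name__ == "__main__":
    x, z = (-0.3, 0.2), (0.5, -0.1)
    y = (x[0] + 0.4 * (z[0] - x[0]), x[1] + 0.4 * (z[1] - x[1]))  # on the segment xz
    w = (0.1, 0.4)                                               # off the line xz
    print(klein_distance(x, y) + klein_distance(y, z), klein_distance(x, z))
    print(klein_distance(x, w) + klein_distance(w, z) > klein_distance(x, z))
```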




Figure 19.9. The Beltrami-Klein model of the hyperbolic plane 

In fact, it was well known by the time of Hilbert's talk that the only Riemannian 
geometries whose geodesics are straight lines are the Euclidean, spherical and 
hyperbolic ones (Beltrami's theorem). 

Posing his problem, Hilbert was motivated by two other examples. One was 
discovered by Hilbert in 1894, and it is called the Hilbert metric. The Hilbert 
metric is a generalization of the Beltrami-Klein model with the unit disc replaced 
by an arbitrary convex domain. The distance is given by the same formula (19.6) 
but, unless the boundary curve is an ellipse, this Finsler metric is not Riemannian 
anymore. Another example was studied by H. Minkowski in the framework of 
number theory. In Minkowski geometry, the indicatrices at different points are 
identified by parallel translations. This is a homogeneous but, generally, anisotropic 
geometry. 

The solution to Hilbert's Fourth problem is based on the Crofton formula. Let 
$f(p, \varphi)$ be a positive continuous function on the space of rays, even with respect 
to the orientation reversal of a line: $f(-p, \varphi + \pi) = f(p, \varphi)$. Then one has a new 
area form on the space of rays, $f(p, \varphi)\, d\varphi\, dp$. Define the length of a curve by the 
formula 

$$\mathrm{length}(\gamma) = \frac{1}{4} \iint n_\gamma(p, \varphi)\, f(p, \varphi)\, d\varphi\, dp. \tag{19.7}$$

We obtain a geometry in the plane, and we claim that its geodesics are straight 
lines. To see that this is the case one needs to check the triangle inequality: the 
sum of the lengths of two sides of a triangle is greater than the length of the third 
side. Indeed, every line intersecting the third side also intersects the first or the 
second, and integration in (19.7) yields the desired triangle inequality. 

In fact, every Finsler metric whose geodesics are straight lines is given by 
formula (19.7), but we do not prove this fact here. This means that in each such 
geometry one has a version of the Crofton formula. In higher dimensions, Hilbert's 
Fourth problem has a similar solution; instead of the space of lines one utilizes the 
space of hyperplanes and a version of Crofton's formula therein. 

To conclude our brief discussion of Hilbert's Fourth problem, let us mention 
an elegant description of metrics whose geodesics are straight lines. This is due to 
Hilbert's student, Hamel, and was obtained in 1901, shortly after Hilbert's talk. 

One may characterize a Finsler metric by a function which, following the tradition 
in physics, we call the Lagrangian and denote by $L$. Given a velocity vector $v$ 
at point $x$, the value of the Lagrangian $L(x, v)$ is the magnitude of the vector $v$ in 
units of the speed of light, that is, the ratio of the magnitude of $v$ to the speed of light at this point 
and in this direction. Said differently, the indicatrix at point $x$ consists of the velocity 
vectors $v$ satisfying the equation $L(x, v) = 1$. Clearly, $L(x, v)$ is homogeneous 
in the velocity: $L(x, tv) = tL(x, v)$ for all positive $t$. 

For example, in Euclidean geometry, $L(x, v) = |v|$, the Euclidean length of the 
vector. In Minkowski geometry, the Lagrangian $L(x, v)$ does not depend on $x$. 

For a smooth curve $\gamma(t)$, its length is given, in terms of the Lagrangian, by the 
formula 

$$\mathrm{length}(\gamma) = \int L(\gamma(t), \gamma'(t))\, dt.$$

Due to the fact that the Lagrangian is homogeneous of degree one, this integral 
does not depend on the parameterization (as every student of calculus learns when 
studying line integrals). 

One may recover the Lagrangian from formula (19.7) by applying it to an 
infinitesimal segment $\gamma = [x, x + \varepsilon v]$. The result is the following formula: 

$$L(x_1, x_2, v_1, v_2) = \frac{1}{4} \int_0^{2\pi} |v_1 \cos\alpha + v_2 \sin\alpha|\, f(x_1 \cos\alpha + x_2 \sin\alpha, \alpha)\, d\alpha, \tag{19.8}$$

see Exercise 19.11. 
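
As a sanity check of the integral formula for the Lagrangian, one may evaluate it numerically. The Python sketch below assumes the normalization $\frac{1}{4} \int_0^{2\pi}$, which is the one that returns the Euclidean length $|v|$ when $f \equiv 1$; the function name, the midpoint rule and the sample density $f(p, \alpha) = 1 + p^2$ are our own choices.

```python
import math

def lagrangian(x1, x2, v1, v2, f, n=20000):
    """Midpoint-rule value of (1/4) * integral over [0, 2*pi] of
    |v1 cos a + v2 sin a| * f(x1 cos a + x2 sin a, a) da."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        a = (k + 0.5) * h
        c, s = math.cos(a), math.sin(a)
        total += abs(v1 * c + v2 * s) * f(x1 * c + x2 * s, a)
    return total * h / 4


if __name__ == "__main__":
    euclid = lambda p, a: 1.0        # f = 1 should give the Euclidean length
    sample = lambda p, a: 1.0 + p * p  # an even positive density, for illustration
    print(lagrangian(0.7, -0.2, 3.0, 4.0, euclid))  # close to 5
    print(lagrangian(0.7, -0.2, 6.0, 8.0, sample) /
          lagrangian(0.7, -0.2, 3.0, 4.0, sample))  # homogeneity: close to 2
```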

We are ready to formulate Hamel's theorem. 
Theorem 19.6. A Lagrangian $L(x_1, x_2, v_1, v_2)$ defines a Finsler metric whose 
geodesics are straight lines if and only if 

$$\frac{\partial^2 L}{\partial x_1 \partial v_2} = \frac{\partial^2 L}{\partial x_2 \partial v_1}.$$

The explicit formula (19.8) provides a solution to this partial differential equation. 

For more on Hilbert's Fourth problem, see Busemann's contribution in [93], 
the books [61, 91] and the article [3]. 
19.7 Exercises. 

19.1. Prove formula (19.1). 

19.2. Prove that the axial projection from a sphere to the circumscribed cylin- 
der is area-preserving (Archimedes). 

19.3. Prove that the area element on the space of oriented great circles on the 
sphere, defined in Section 19.2, is invariant under the rotations of the sphere. 

19.4. Make an explicit computation to verify Crofton's formula (19.2) for a 
unit segment. 

19.5. Prove Crofton's formula for the sphere. 

19.6. Let $\Gamma$ be a closed convex curve and $\gamma$ a closed, possibly self-intersecting, 
curve inside $\Gamma$; let $L$ and $\ell$ be their lengths. Prove that there exists a line that 
intersects $\gamma$ at least $[2\ell/L]$ times. 

19.7. (a) Replace each side of an equilateral triangle with the arc of a circle 
centered at the opposite vertex. Prove that the resulting convex curve has constant 
width (Reuleaux triangle). 

(b) Construct a similar figure of constant width based on a regular n-gon with 
odd n. 

(c) Prove that one can circumscribe a regular hexagon about a curve of constant 
width. 

Hint. Circumscribe a hexagon with angles 120° and show that the sum of any 
two consecutive sides is the same. Rotate this hexagon 60° and show that some 
intermediate hexagon is regular. 

(d) * Prove that the Reuleaux triangle has the smallest area among the figures 
of constant width. 

Hint. Circumscribe a regular hexagon about the curve of constant width $\gamma$ 
and inscribe a Reuleaux triangle into this hexagon as well. Show that the support 
function of the Reuleaux triangle is not greater than that of $\gamma$ and use Exercise 
10.4 (a). 

19.8. Prove that if one randomly drops a curve of length $\ell$ on the ruled paper 
then the average number of intersections with a line equals $2\ell/\pi$. 

19.9. Let $\gamma$ be a not necessarily closed curve inside a unit circle, $C$ its total 
absolute curvature and $L$ its length. Prove that $L \le C + 2$. 

19.10. Formulate and prove a version of the DNA theorem for a curve inside a 
ball in space. 

19.11. * Let $\gamma$ be a closed curve in space and $C$ its total curvature. For a unit 
vector $v$, consider the orthogonal projection on the plane along $v$. Let $C_v$ be the 
total absolute curvature of the plane projection of $\gamma$. Prove that 

$$C = \frac{1}{4\pi} \int C_v\, dv,$$

where $v$ is considered as a point of the unit sphere and the integration is with 
respect to the standard area element on the sphere. 

Hint: assume that $\gamma$ is a polygonal line and reduce to the case of a single angle. 

19.12. Prove the triangle inequality for the Hilbert metric. 
LECTURE 20 

Curvature and Polyhedra 

20.1 In the plane. The curvature of a smooth plane curve is the rate with 
which the tangent line turns as one moves along the curve with unit speed. How 
does one define the curvature of a polygonal line? 

The curvature of a plane wedge $\alpha$ is defined as its defect, $\pi - \alpha$, the angular 
measure of the complementary angle. The more acute the angle is, the greater its 
curvature. The curvature of a polygonal line is the sum of the curvatures of its 
angles. 




Figure 20.1. Sum of exterior angles of a convex polygon 

The sum of curvatures of a convex polygon is $2\pi$, see Figure 20.1. The sums of 
curvatures of the two 7-pronged stars in Figure 20.2 are $4\pi$ and $6\pi$. 
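
These values are easy to reproduce by a direct computation. The Python sketch below sums the defects $\pi - \alpha$ over the vertices of a closed polygonal line. We assume, without being able to consult the figure, that the two stars are the regular ones obtained by joining every second and every third vertex of a regular 7-gon; the function names are ours.

```python
import math

def polygon_curvature(points):
    """Sum over the vertices of (pi - angle) for a closed polygonal line."""
    n = len(points)
    total = 0.0
    for i in range(n):
        (ax, ay), (bx, by), (cx, cy) = points[i - 1], points[i], points[(i + 1) % n]
        ux, uy = bx - ax, by - ay
        wx, wy = cx - bx, cy - by
        # The angle between consecutive edge vectors equals pi minus the angle
        # of the polygonal line at this vertex.
        total += abs(math.atan2(ux * wy - uy * wx, ux * wx + uy * wy))
    return total

def star(n, m):
    """Closed polygonal line joining every m-th vertex of a regular n-gon."""
    return [(math.cos(2 * math.pi * m * k / n), math.sin(2 * math.pi * m * k / n))
            for k in range(n)]


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, polygon_curvature(star(7, m)) / math.pi)  # close to 2, 4, 6
```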

Figure 20.2. Two 7-pronged stars 

20.2 Curvature of a polyhedral cone. Each face of a polyhedral cone 
has an angle subtended at the vertex; we call this angle a flat angle. Consider 
a polyhedral cone with flat angles $\alpha_1, \ldots, \alpha_n$. Define its curvature as the defect, 
$2\pi - (\alpha_1 + \cdots + \alpha_n)$. Note that curvature can be positive or negative; if the cone 
is flat then its curvature is zero. 

If the cone has more than three faces, it is flexible. Imagine that the faces, 
which are rigid plane wedges, are connected at the edges by hinges. Then one can 
deform the cone in such a way that each face is not stretched or compressed, but the 
dihedral angles vary. Under such a deformation, the curvature remains constant. 

Let P be a convex polyhedron. 

Lemma 20.1. The sum of curvatures of all vertices of $P$ equals $4\pi$. 

Proof. Let $v$, $e$, $f$ be the numbers of vertices, edges and faces of $P$. These numbers 
satisfy the Euler formula: $v - e + f = 2$ (see Lecture 24 for a proof). 

Let us compute the sum $S$ of all angles of the faces of $P$. At a vertex, the sum 
of angles is $2\pi$ minus the curvature of this vertex. Summing up over the vertices 
gives: 

$$S = 2\pi v - K$$

where $K$ is the total curvature. On the other hand, one may sum over the faces. 
The sum of the angles of the $i$-th face is $\pi(n_i - 2)$, where $n_i$ is the number of sides 
of this face. Hence 

$$S = \pi(n_1 + \cdots + n_f) - 2\pi f.$$

Since every edge is adjacent to two faces, $n_1 + \cdots + n_f = 2e$; therefore 

$$S = 2\pi e - 2\pi f.$$

Thus $2\pi v - K = 2\pi e - 2\pi f$, that is, $K = 2\pi(v - e + f)$, which, combined with the Euler formula, yields the 
result. □ 

An analog of Lemma 20.1, along with its proof, holds for non-convex polyhedra 
as well and even for other polyhedral surfaces not necessarily topologically equivalent 
to the sphere (for example, a torus): the total curvature equals $2\pi\chi$ where 
$\chi = v - e + f$ is the Euler characteristic. 
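
For the five regular polyhedra the lemma can be verified by a one-line computation: if $q$ regular $p$-gons meet at each vertex, the curvature of a vertex is $2\pi - q\pi(p - 2)/p$. The short Python sketch below tabulates this; the dictionary and function names are ours.

```python
import math

# name: (faces meeting at a vertex, sides of a face, number of vertices)
REGULAR_POLYHEDRA = {
    "tetrahedron": (3, 3, 4),
    "cube": (3, 4, 8),
    "octahedron": (4, 3, 6),
    "dodecahedron": (3, 5, 20),
    "icosahedron": (5, 3, 12),
}

def vertex_curvature(q, p):
    """Defect 2*pi minus the sum of q angles of a regular p-gon."""
    return 2 * math.pi - q * math.pi * (p - 2) / p

def total_defect(name):
    """Sum of the curvatures of all vertices of the named regular polyhedron."""
    q, p, v = REGULAR_POLYHEDRA[name]
    return v * vertex_curvature(q, p)


if __name__ == "__main__":
    for name in REGULAR_POLYHEDRA:
        print(name, total_defect(name) / math.pi)  # each value is close to 4
```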

20.3 Dual cones and spherical polygons. Given a convex polyhedral cone 
$C$ with vertex $V$, consider outward normal lines to its faces through $V$. These 
lines are the edges of a new convex polyhedral cone $C^*$ called dual to $C$. A similar 
construction in the plane yields an angle which complements the original angle to $\pi$. 
The relation between a cone and its dual in space is more complex and is described 
in the next lemma. 

Lemma 20.2. The angles between the edges of $C^*$ complement the dihedral angles 
of $C$ to $\pi$, and the dihedral angles of $C^*$ complement the angles between 
the edges of $C$ to $\pi$. 

Proof. The first claim is clear from Figure 20.3 and the second from the sym- 
metry of the relation between C and C* . □ 




Figure 20.3. Proving Lemma 20.2 

From this point on, our arguments involve spherical geometry. It is natural to 
replace the Euclidean plane by the (unit) sphere - after all, the surface of the Earth 
is (approximately) a sphere. 1 The role of straight lines is played by great circles 
(unlike the plane, any two such "lines" intersect at two points); a spherical polygon 
is bounded by arcs of great circles. The reader who wants to learn what are the 
counterparts of straight lines on arbitrary surfaces should wait until Section 20.8. 
The angle between two great circles, intersecting at point X, is defined as the angle 
between their tangent lines in the tangent plane to the sphere at X. 

A peculiar property of spherical geometry is the absence of similarities; in 
particular, one cannot dilate a polygon so that its angles remain the same but the 
area changes. More precisely, the area of a spherical polygon is determined by its 
angles. 

Theorem 20.1. Let $P$ be a convex $n$-gon on the unit sphere, $A$ its area and 
$\alpha_1, \ldots, \alpha_n$ its angles. Then 

$$A = \alpha_1 + \cdots + \alpha_n - (n - 2)\pi. \tag{20.1}$$

Note that, for a plane $n$-gon, the right-hand side of (20.1) vanishes. 

Proof. Let us start with $n = 2$. A 2-gon is a domain bounded by two meridians 
connecting the poles. If $\alpha$ is the angle between the meridians, then the area of the 
2-gon is the $(\alpha/2\pi)$-th part of the total area $4\pi$ of the sphere, that is, $2\alpha$. 



1 Spherical geometry was already known to the Ancient Greeks. Many results of plane geom- 
etry have spherical analogs, for example, there are spherical Laws of Sines and Cosines. 
Figure 20.4. Area of a spherical triangle 

Next, consider a spherical triangle; see Figure 20.4. The three great circles form 
six 2-gons that cover the sphere. The original triangle and its antipodal triangle 
are covered three times, and the rest of the sphere is covered once. The total area 
of the six 2-gons equals $2(2\alpha_1 + 2\alpha_2 + 2\alpha_3)$; hence 

$$4(\alpha_1 + \alpha_2 + \alpha_3) = 4\pi + 4A.$$

This implies the statement for $n = 3$. 

Finally, every convex $n$-gon with $n \ge 4$ can be cut by its diagonals into $n - 2$ 
triangles. The area and the sum of angles are additive under cutting, and (20.1) 
follows. □ 
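
Formula (20.1) for $n = 3$ is easy to try out on triangles whose area is known in advance. In the Python sketch below the vertices are unit vectors, and the angle at a vertex is measured between the tangent directions of the two sides, as in the definition above. The function names are ours; the two test triangles are an octant of the sphere (area $4\pi/8$) and a triangle with one vertex at the pole and two on the equator, which is half of a 2-gon.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def vertex_angle(a, b, c):
    """Angle at the vertex a of the spherical triangle abc (unit vectors)."""
    # Tangent vectors at a pointing toward b and c along the great circles.
    tb = tuple(b[i] - _dot(a, b) * a[i] for i in range(3))
    tc = tuple(c[i] - _dot(a, c) * a[i] for i in range(3))
    n = _cross(tb, tc)
    return math.atan2(math.sqrt(_dot(n, n)), _dot(tb, tc))

def triangle_area(a, b, c):
    """Area of a spherical triangle by formula (20.1) with n = 3."""
    return (vertex_angle(a, b, c) + vertex_angle(b, c, a)
            + vertex_angle(c, a, b) - math.pi)


if __name__ == "__main__":
    octant = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    print(triangle_area(*octant), math.pi / 2)
    lam = 1.0  # difference of longitudes of the two equatorial vertices
    half_lune = ((1.0, 0.0, 0.0), (math.cos(lam), math.sin(lam), 0.0), (0.0, 0.0, 1.0))
    print(triangle_area(*half_lune), lam)
```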

As a consequence, we interpret the curvature of a convex polyhedral cone C 
as the area of a spherical polygon. Let C* be the dual cone and consider the unit 
sphere centered at its vertex. The intersection of C* with the sphere is a convex 
spherical polygon P. The area of P measures the solid angle of the cone C* . 

Corollary 20.2. The area A of the spherical polygon P equals the curvature 
of the cone C. 

Proof. Assume that $P$ is $n$-sided and let $\alpha_i$ be its angles. The area of $P$ is given 
by formula (20.1). 

The angles $\alpha_i$ are the dihedral angles of $C^*$. Let $\beta_i$ be the angles between the 
edges of the cone $C$. By Lemma 20.2, $\alpha_i = \pi - \beta_i$. Substitute into (20.1) to obtain: 

$$A = 2\pi - (\beta_1 + \cdots + \beta_n).$$

The right hand side is the curvature of the cone $C$, as claimed. □ 

Corollary 20.2 provides an alternative proof of Lemma 20.1: one may translate 
the dual cones at all the vertices of $P$ to the origin, and then the cones will cover the 
whole space, see Figure 20.5. It follows that the sum of the areas of the respective 
spherical polygons is $4\pi$, and Lemma 20.1 follows. This alternative proof, combined 
with the argument of Lemma 20.1, implies Euler's formula as well. 
Figure 20.5. Sum of curvatures of a convex polyhedron 

20.4 Parallel translation and rolling. Let us define parallel translation on 
a polyhedral surface P. What we want to translate is a vector, say v, that lies 
in one of the faces of the polyhedron. As long as one stays within one face, one 
parallel translates v, just as in the plane. The question arises when one wants to 
carry the vector over an edge. 

Let $F_1$ and $F_2$ be adjacent faces that share an edge $E$. Identify the planes of 
the two faces by revolution about $E$ (as if they were connected by hinges). Let a 
tangent vector $v$ be parallel translated inside $F_1$. When the foot point of $v$ reaches 
$E$, apply this rotation to obtain a vector $u$ that lies in $F_2$. Vector $u$ is the result of 
parallel translating $v$ across edge $E$. 

Said differently, under the parallel translation of $v$ across edge $E$, the tangential 
component of $v$ along $E$ and its normal component remain the same. See Figure 
20.6, featuring parallel translation of a vector across three adjacent edges of a cube. 




Figure 20.6. Parallel translation on a cube 

Equivalently, position $P$ so that the face $F_1$ is on the horizontal plane and roll 
it across edge $E$. Now the face $F_2$ is on the horizontal plane. The prints of the 
vectors $v$ and $u$ on the horizontal plane are parallel. Thus parallel translation on 
the surface of a polyhedron is the same as rolling this polyhedron on the horizontal 
plane. 

A geodesic $\gamma$ is defined as a curve on a polyhedral surface which is straight 
within each face and whose tangent vectors are parallel translated across each edge 
intersected by $\gamma$. We assume that geodesics do not pass through vertices. A realistic 
image of a geodesic is a ribbon wrapped around a box of chocolate. When rolling 
a polyhedron in the plane, a geodesic leaves a straight trace. 

Geodesics minimize the distance between their sufficiently close points. 

Lemma 20.3. Consider two planes in space, $F_1$ and $F_2$, intersecting along a 
line $E$, and let $A_1$ and $A_2$ be points in $F_1$ and $F_2$. Let $\gamma$ be the shortest path from 
$A_1$ to $A_2$ across the edge $E$ on the surface made by two half-planes, separated by $E$. 
Then $\gamma$ is a geodesic. 

Proof. Turn the plane $F_2$ around the edge $E$ until it coincides with the plane 
$F_1$. The shortest curve $\gamma$ from $A_1$ to $A_2$ unfolds to a straight segment, therefore 
the unit tangent vector to $\gamma$ is parallel translated across $E$. □ 

20.5 Gauss-Bonnet theorem. Let $V$ be a vertex of a polyhedral cone $C$. 
Consider a vector that lies in one of the faces of the cone. Choose a closed path on 
the surface of the cone, starting at the foot point of the vector and going around 
$V$ once counterclockwise, and parallel translate the vector along the path. What 
happens? The foot point returns to the initial position, and the vector turns through 
some angle $\alpha$. This angle depends neither on the choice of the vector nor on the 
path. What is this angle? 

Lemma 20.4. The angle $\alpha$ equals the curvature of $C$. 

Proof. Instead of parallel translating, put $C$ on the horizontal plane and roll 
it across consecutive edges. The resulting unfolding of the cone is a plane wedge 
whose measure is the sum of flat angles of $C$. The angle in question complements 
this sum to $2\pi$, and the result follows; see Figures 20.6 and 20.7. □ 



Figure 20.7. Proving Lemma 20.4 

More generally, let $\gamma$ be an oriented simple closed path on a polyhedral surface 
$P$; we assume that $\gamma$ intersects the edges transversally and avoids the vertices. The 
curve $\gamma$ partitions $P$ into two components, one on the left and one on the right of 
the curve. We refer to the former as the domain bounded by $\gamma$. Choose a tangent 
vector $v$ with foot point on $\gamma$ and parallel translate it along $\gamma$. Let $u$ be the final 
vector, whose foot point coincides with that of $v$; denote by $\alpha(\gamma)$ the angle between 
$v$ and $u$. The next result is (a polyhedral version of) the celebrated Gauss-Bonnet 
theorem. 

Theorem 20.3. The angle $\alpha(\gamma)$ equals the sum of curvatures of the vertices of 
$P$ that lie in the domain bounded by $\gamma$. 
Proof. Let us argue inductively on the number $n$ of vertices inside $\gamma$. If $n = 1$, 
this is Lemma 20.4. 

Figure 20.8. Proving the Gauss-Bonnet theorem 

If $n > 1$, one may cut the domain bounded by $\gamma$ by an arc $\delta$, with end points $A$ and $B$ on $\gamma$, into two domains, 
each with fewer than $n$ vertices; see Figure 20.8. Let $\gamma_1$ be the path that follows 
$\gamma$ from $A$ to $B$ and then $\delta$ from $B$ to $A$. Likewise, $\gamma_2$ is the path that follows $\delta$ 
from $A$ to $B$ and then $\gamma$ from $B$ to $A$. The concatenation of $\gamma_1$ and $\gamma_2$ differs 
from $\gamma$ by the arc $\delta$, traversed back and forth. Hence the contribution of $\delta$ cancels, 
$\alpha(\gamma) = \alpha(\gamma_1) + \alpha(\gamma_2)$, and the result follows by induction. □ 



20.6 Closed geodesics on generic polyhedra. Figure 20.9 depicts simple 
closed geodesics on a regular tetrahedron and a cube. The former is the section of 
the tetrahedron by a plane parallel to a pair of pairwise skew edges, and the latter 
is the section of the cube by a plane perpendicular to its great diagonal. 

Figure 20.9. Closed geodesics on polyhedra 



Can one find such a geodesic on a generic closed convex polyhedron $P$? What 
we mean by "generic" is that the only linear relation, with rational coefficients, 
between the curvatures of the vertices and $\pi$ is the one given by Lemma 20.1. 

Theorem 20.4. There exist no simple closed geodesics on $P$. 

Proof. Assume there is such a geodesic $\gamma$. Then the unit tangent vector to $\gamma$ 
is parallel translated along $\gamma$. In particular, this tangent vector returns, without 
rotation, to the initial point. 

On the other hand, by the Gauss-Bonnet theorem, parallel translation along $\gamma$ 
results in rotation through the angle equal to the sum of curvatures of the vertices 
inside $\gamma$. This set of vertices is a proper and non-empty subset of the set of vertices 
of $P$. Since $P$ is generic, the sum of curvatures cannot be a multiple of $2\pi$, a 
contradiction. □ 

In particular, the geodesics in Figure 20.9 will disappear after a generic small 
perturbation of the tetrahedron or the cube. 

20.7 Closed geodesics on regular polyhedra. The discussion in the preceding 
section suggests that more symmetric polyhedra are more amenable to constructing 
closed geodesics. From this point of view, it is reasonable to investigate 
the case of regular polyhedra (see [16, 31] for such a study). 

Figure 20.10. Triangular tiling of the plane 

The simplest case is that of a regular tetrahedron. Denote a regular tetrahedron 
with the edge 1 by T and its vertices by A, B, C, D. Consider the standard tiling 
of the plane by equilateral triangles with the unit edge and label vertices by letters 
A, B, C, D as shown in Figure 20.10. There is a natural map it of the plane onto 
the tetrahedron T taking the vertices of the tiling into the vertices of T labeled by 
the same letters, edges into edges and triangles (tiles) into the faces. 

Take the coordinate system in the plane with the origin at a point labeled A 
and coordinate vectors AB, AC where B is the next vertex to the right of A and C is 
immediately above AB. Take points X = (a, 0), 0 < a < 1, on AB and X′ = (a + 2p, 2q), 
q ≥ p ≥ 0, q > 0, on another side labeled AB. If (p, q) = 1 (that is, p and q are 
relatively prime) and qa ∉ Z, then the map π takes XX′ into a simple (not repeating 
itself) closed geodesic on T of length 2√(p² + pq + q²), the length of the segment XX′. 
Moreover, this geodesic is non-self-intersecting and all closed 
geodesics on T are given by this construction; in particular, all closed geodesics 
on T are non-self-intersecting (see Exercise 20.8). The geodesic corresponding to 
p = 2, q = 3 is shown in Figure 20.11. It cuts the tetrahedron into two pieces also 
shown in Figure 20.11. (In accordance with the Gauss-Bonnet Theorem, each of 
the two pieces contains two vertices.) 
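
The conditions of this construction are easy to test by machine. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and the offset a is assumed to be rational so that the test qa ∉ Z can be made exactly. It computes the length of the segment XX′ in the skew coordinate system with basis AB, AC (unit vectors at an angle of 60°) and checks the hypotheses on a, p and q.

```python
from fractions import Fraction
from math import gcd, sqrt

def segment_length(p, q):
    """Length of the vector 2p*AB + 2q*AC, where AB and AC are unit
    vectors at an angle of 60 degrees (so that AB . AC = 1/2)."""
    return 2 * sqrt(p * p + p * q + q * q)

def is_admissible(a, p, q):
    """Check 0 < a < 1, q >= p >= 0, q > 0, gcd(p, q) = 1 and q*a not an integer.
    The offset a is assumed rational (an assumption of this sketch)."""
    a = Fraction(a)
    return (0 < a < 1 and q >= p >= 0 and q > 0
            and gcd(p, q) == 1 and (q * a).denominator != 1)

print(segment_length(2, 3))                  # the geodesic of Figure 20.11
print(is_admissible(Fraction(1, 2), 2, 3))   # 3 * 1/2 is not an integer
print(is_admissible(Fraction(1, 3), 2, 3))   # 3 * 1/3 = 1: the segment meets a vertex
```

For p = 0, q = 1 the sketch returns length 2, which corresponds to the segment from (a, 0) to (a, 2).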

Figure 20.11. A closed geodesic on the tetrahedron 

A full description of closed geodesics on the regular octahedron is given in 
Exercise 20.9. Up to parallelism and symmetries of the octahedron, there are only 
two non-self-intersecting closed geodesics, of lengths 3 and 2√3 (where the edge of 
the octahedron is unit); one of them is planar, one is not. There are also infinitely 
many non-parallel self-intersecting geodesics. The two non-self-intersecting closed 
geodesics and one self-intersecting closed geodesic are shown in Figure 20.12. 




Figure 20.12. Three closed geodesics on the octahedron 

An almost full description of closed geodesics on the cube is given in Exercise 
20.10. There are three types of non-self-intersecting closed geodesics (their lengths 
are 4, 3√2, 2√5), and infinitely many types of self-intersecting closed geodesics. The 
three non-self-intersecting and three self-intersecting closed geodesics are shown, 
respectively, in Figures 20.13, 20.14. 
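
The lengths quoted for the octahedron and the cube can be recovered from the pairs (k, ℓ) listed in Exercises 20.9(c) and 20.10(c). The sketch below is an illustration only (the function names are hypothetical): it computes the length of the vector (k, ℓ) in the triangular coordinate system of Figure 20.10 and in the ordinary square lattice.

```python
from math import sqrt

def length_triangular(k, l):
    """Length of the vector (k, l) in a basis of two unit vectors at 60 degrees."""
    return sqrt(k * k + k * l + l * l)

def length_square(k, l):
    """Length of the vector (k, l) in the standard orthonormal basis."""
    return sqrt(k * k + l * l)

# Octahedron: the pairs (0, 3) and (2, 2) of Exercise 20.9(c).
print(length_triangular(0, 3), length_triangular(2, 2))

# Cube: the pairs (0, 4), (3, 3) and (2, 4) of Exercise 20.10(c).
print(length_square(0, 4), length_square(3, 3), length_square(2, 4))
```

The printed values agree numerically with 3, 2√3 for the octahedron and with 4, 3√2, 2√5 for the cube.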







Figure 20.13. Simple closed geodesics on the cube 





Figure 20.14. Self-intersecting closed geodesics on the cube 

On a regular icosahedron, there are 3 (up to parallelism and symmetries of 
the icosahedron) non-self-intersecting closed geodesics (of lengths 5, 3√3 and 2√7), 
and only one of them is planar. They are shown in Figure 20.15. Also, there are 
infinitely many self-intersecting closed geodesics. (See the details in [31].) 

20.8 Smooth surfaces: a panorama. In differential geometry, all the notions 
that we have discussed so far are defined for smooth surfaces. The relation 
between the polyhedral and smooth cases is, roughly, the same as between polygonal 
lines and smooth curves in the plane discussed in Section 20.1. 

The definition of curvature is modeled on the statement of Corollary 20.2. Let 
X be a point of a surface. Consider a small neighborhood of point X of area A. 
At every point of this neighborhood consider the unit normal vector to the surface 
(so the surface looks like a porcupine). Translate the foot points of these normal 
vectors to the origin to obtain a piece of the unit sphere. Let A' be its area. The 
curvature of the surface at point X is the limit value of the ratio A' /A as the 
neighborhood shrinks to point X. 

For example, the curvature of a cylinder or a cone is zero: the end points of the 
unit normal vectors lie on a curve, whose area is zero. The curvature of a sphere 
of radius r is 1/r². 

The relation of this definition to the definition of the curvature of a polyhedral 
cone in Section 20.3 is straightforward. Let C be a polyhedral cone. One can 
smooth its edges and its vertex to obtain a smooth surface, approximating the 
cone. The unit normal vectors to this surface describe a domain on the unit sphere. 

Figure 20.15. Closed geodesics on the icosahedron 

For the cone, this domain becomes a spherical polygon P, the intersection of the 
sphere with the dual cone C* . The area of P is the curvature of C, as we know 
from Corollary 20.2. 

The curvature has a sign. This is because the area A' is signed. Specifically, 
traverse the boundary of a neighborhood of point X counterclockwise. The end 
points of the unit normal vectors traverse a curve on the unit sphere; if this curve is 
oriented counterclockwise, the curvature is positive, and if clockwise, negative. 
See Figure 20.16 for the case of a torus. 



Figure 20.16. Positive and negative curvature 

Similarly to Lemma 20.1, the total curvature of a closed convex surface is 
4π. For more general closed surfaces, the answer is 2πχ, where χ is the Euler 
characteristic. 

The curvature of a polyhedral cone does not change under deformations that 
preserve flat angles. Likewise, the curvature of a smooth surface remains constant 
under isometric deformations, that is, deformations that do not change the inner 
geometry of the surface.² For example, a wide variety of developable surfaces are obtained 
by bending, without stretching, a sheet of paper (piece of plane), and they all have 
zero curvature; see Lecture 13. 

Let γ be an oriented smooth curve on a smooth surface S. Parallel translation 
along γ is defined as rolling, without sliding, of the tangent plane to S along γ. 
Equivalently, one may put S on the plane and roll the surface along the curve γ. 

The Gauss-Bonnet theorem holds in the smooth setting: parallel translation of 
the tangent plane to a surface S along an oriented simple closed curve γ results in 
the rotation of the tangent plane through the angle equal to the total curvature of 
S inside γ. 

A geodesic curve on a smooth surface S is defined as a curve γ whose tangent 
vector is parallel translated along γ. If one rolls a surface along a geodesic curve, the 
trace on the horizontal plane is straight. Geodesics are trajectories of a free particle 
confined to S: their speed remains constant and their acceleration is orthogonal to 
the surface (that is, the only force acting on the particle is the normal reaction 
force). For example, the geodesics on a sphere are great circles. Just as in the 
polyhedral case, geodesics minimize the distance between pairs of their sufficiently 
close points. 

In contrast to the situation of Theorem 20.4, every closed smooth convex surface 
carries a simple closed geodesic, in fact, even three: this was conjectured by 
Poincaré; a proof was published by Lyusternik and Shnirelman in 1930. These three 
closed geodesics are manifest for an ellipsoid: they are its sections by the three 
planes of symmetry. 

Finally, define the geodesic curvature of an oriented curve on a smooth surface. 
Approximate a curve by a geodesic polygonal line, γ. The geodesic curvature is 
concentrated at the corners, and its value is the angular measure of the 
complementary angle, positive if γ turns left and negative if it turns right. The geodesic 
curvature of a geodesic line is zero. 

Let γ be an oriented simple geodesic polygon. Parallel translation along γ 
results in rotation of the tangent plane, and the angle of this rotation complements 
the total geodesic curvature of γ to 2π, see Figure 20.17. This yields another version 
of the Gauss-Bonnet theorem: the total geodesic curvature of an oriented simple 
closed curve γ plus the total curvature of the surface inside γ equals 2π. 

20.9 Three examples: tennis ball, Foucault pendulum, and bicycle 
wheel. Every tennis ball has an indented closed curve on its surface. Mark a point 
of this curve and put the ball on the floor so that it is touching the floor at the 
marked point. Now roll the ball without sliding along the curve until it again 
touches the floor at the marked point. From the initial to the final positions of the 
ball, it has made a certain revolution about the vertical axis. What is the angle of 
this revolution? 

The Gauss-Bonnet theorem provides an answer. The angle in question is the 
total curvature bounded by the curve. Although the curve has a complicated shape, 
a glance at a tennis ball reveals that this curve is symmetric and bounds exactly 
one half of the total curvature of the ball, that is, 2π. Hence the angle of revolution 
is zero. 



² This is the Theorema Egregium of C.-F. Gauss. 

Figure 20.17. Parallel translation on a surface 



Our second example concerns the Foucault pendulum, which demonstrates the rotation 
of the earth. The original pendulum was constructed by Léon Foucault for the 1850 
Paris Exhibition: this was a 67 meter, 28 kilogram pendulum suspended from the 
dome of the Pantheon in Paris; the plane of its motion, with respect to the earth, 
rotated slowly clockwise.³ Now almost every science museum exhibits a Foucault 
pendulum. 

Imagine that the pendulum is suspended at the North Pole. The plane of 
its motion remains the same while the earth rotates eastward. Thus the plane of 
motion of the pendulum rotates, with respect to the earth, with angular speed of 
360°/24 = 15° per hour. The closer to the equator, the weaker the effect, and on 
the equator, the plane of motion of the pendulum does not rotate relative to the 
earth (by symmetry!) 

The Foucault pendulum is a purely geometric phenomenon. Its behavior is due 
to the motion of the suspension point. Let us imagine that the earth is not rotating, 
and only the suspension point of the pendulum is moving along a curve γ on the 
surface of the earth. Assume further that γ is a spherical polygon. As long as we 
are moving along a geodesic segment, the plane of motion of the pendulum does 
not rotate relative to the direction of the geodesic. At a corner, the plane of motion 
of the pendulum remains the same but the direction of motion of the suspension 
point changes by the exterior angle of γ. Thus the plane of motion of the pendulum 
turns relative to the direction of γ by the exterior angle at the corner. 

Conclusion: the total rotation of the plane of motion of the pendulum equals 
the total geodesic curvature of the trajectory of its suspension point. By the 
Gauss-Bonnet theorem, this is 2π minus the total curvature inside this trajectory. 

Consider a Foucault pendulum at latitude φ. The trajectory of the suspension 
point is a latitude circle. The total curvature of the polar cap of the sphere bounded 
by this circle equals the area of the polar cap of latitude φ on the unit sphere; it 
is easily computed (see Exercise 20.3) and the answer is 2π(1 − sin φ). Thus the 
total rotation of the plane of motion of the pendulum is 2π sin φ. 

For Paris, φ is about 48°, and the angular speed of the Foucault pendulum in 
the Pantheon is about 11° per hour. 
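
The figure for Paris follows from the formula 2π sin φ by simple arithmetic. Here is a short sketch of the computation; the function names are hypothetical, and we assume, as in the North Pole estimate 360°/24, that one full turn of the earth takes 24 hours.

```python
from math import sin, pi, radians

def polar_cap_area(lat_deg):
    """Area of the polar cap of the unit sphere above the given latitude."""
    return 2 * pi * (1 - sin(radians(lat_deg)))

def pendulum_rate_deg_per_hour(lat_deg, day_hours=24.0):
    """Rotation of the plane of motion per hour, in degrees.
    Per day the plane turns by 2*pi minus the curvature of the polar cap."""
    per_day = 2 * pi - polar_cap_area(lat_deg)   # equals 2*pi*sin(latitude)
    return per_day * 180 / pi / day_hours

for latitude in (90, 48, 0):                     # North Pole, Paris, equator
    print(latitude, round(pendulum_rate_deg_per_hour(latitude), 2))
```

At the pole the sketch reproduces 15° per hour, at the equator it gives zero, and for latitude 48° it gives a value slightly above 11° per hour.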



³ We recommend to the reader the novel "Foucault's Pendulum" by Umberto Eco. 

In physics, the rotation of the plane of motion of the pendulum is attributed 
to an inertial force, the Coriolis force. The same force is responsible for the direction of 
the major circulations of air and wind on earth. It is observed that rivers of the 
Northern Hemisphere tend to erode chiefly on the right bank; those of the Southern 
Hemisphere chiefly on the left bank. This is especially manifest for the great 
north-flowing rivers in Russia, such as the Ob, Lena and Yenisey. This phenomenon is 
also often attributed to the Coriolis force although the issue remains somewhat 
controversial (and the Coriolis force is not responsible for rotation of water in 
bathtubs!) 

Finally, following M. Levi [51], one can use a bicycle wheel with frictionless 
bearings for physical realization of parallel translation. Keep the plane of the 
wheel tangent to the surface, and set the angular velocity of the wheel relative 
to its axis to zero. As the center of the wheel is guided along a curve on the surface, 
each spoke, considered as a tangent vector, undergoes parallel translation along 
the curve. (The explanation of this phenomenon is essentially the same as for the 
Foucault pendulum.) 

20.10 Exercises. 

20.1. Find the sum of curvatures of an (n, k)-star, that is, an n-pronged star 
making k turns; the numbers n and k are relatively prime. 

20.2. Let C be a convex cone and C* its dual cone. Show that (C*)* = C. 

20.3. Compute the area of the polar cap of latitude φ on the unit sphere. 

20.4. One throws a loop on a cone and pulls it down. If the cone is sharp, the 
loop will stay but if the cone angle is sufficiently obtuse the loop will slide off (of 
course, we assume that there is no friction). Find the critical cone angle separating 
the two cases. 

20.5. A closed smooth simple curve is drawn in the plane. One places a convex 
body on the plane at some point of the curve and rolls the body, without sliding, 
all around the curve. Prove that the trace of the curve on the surface of the body 
cannot be a closed simple curve. 

Hint. Use the Gauss-Bonnet theorem and the fact that the total curvature of 
a closed simple plane curve is 2π. 

The next problem concerns geometry of curves on the unit sphere and has much 
to do with the material of Section 10.2. 

20.6. Let γ be a simple convex curve on the unit sphere. Move each point of γ 
a distance π/2 along the outer normal (that is, the orthogonal great circle). Denote 
the resulting curve γ* and call it dual to γ. 

(a) Show that (γ*)* is the curve antipodal to γ. 

(b) Prove that the length of γ* equals 2π minus the area bounded by γ. 

(c) Let γ_ε be the curve obtained from γ by moving each point a distance ε along 
the outer normal. Find the length of γ_ε and the area bounded by it. 

Denote by γ′ the curve obtained from γ by moving each point a distance π/2 
along the tangent great circle to γ. The curve γ′ is called the derivative of γ. 

(d) Let γ be a circle of latitude φ. Find γ′. 

(e) Prove that γ′ bisects the area of the sphere. 

20.7. Given a convex polytope P, denote by S(P) the sum of the solid angles 
at its vertices and by D(P) the sum of its dihedral angles. 

(a) If P is a tetrahedron, prove that S(P) − 2D(P) + 4π = 0. 

Hint. Parallel translate the faces of P to the origin and consider the partition of 
the unit sphere by the respective half-spaces. Use the inclusion-exclusion formula. 

(b) In general, prove that S(P) − 2D(P) + 2πf − 4π = 0, where f is the number 
of faces of P (Gram's theorem). 

Hint. Cut P into tetrahedra and use additivity of the desired relation. 

20.8. (a) Prove that the construction given in Section 20.7 gives all closed 
geodesics on a regular tetrahedron. 

(b) Prove that all closed geodesics on a regular tetrahedron are non-self-intersecting. 

20.9. A straight interval in the plane equipped with the standard triangular 
tiling as in Figure 20.10 (but without letter notations for the vertices) not passing 
through the vertices yields a geodesic on an octahedron, once the triangle containing 
the initial point is identified with a face of the octahedron. Consider an interval 
with the endpoints (a, 0), (a + k, ℓ) with 0 < a < 1, k, ℓ ∈ Z, ℓ ≥ k ≥ 0, ℓ > 0, 
(ℓ/(k, ℓ))·a ∉ Z, as in Figure 20.10. Call (k, ℓ) a good pair if this interval corresponds 
to a closed, non-self-repeating geodesic. 

(a) Let (p, q) = 1, q ≥ p ≥ 0, q > 0. If p ≡ q mod 3, then (2p, 2q) is a good 
pair. If p ≢ q mod 3, then (3p, 3q) is a good pair. 

(b) Up to parallelism and symmetries of the octahedron, the pairs from Part 
(a) yield all closed geodesics on the regular octahedron. 

(c) Only geodesics corresponding to the good pairs (0, 3) and (2, 2) are 
non-self-intersecting. 

20.10. A straight interval in the plane not passing through the vertices of the 
standard square tiling yields a geodesic on a cube, once the square containing the 
initial point is identified with a face of the cube. We use the natural coordinate 
system in which the vertices of the tiling are the points with integral coordinates. 
Consider an interval with the endpoints (a, 0), (a + k, ℓ) with 0 < a < 1, k, ℓ ∈ Z, 
ℓ ≥ k ≥ 0, ℓ > 0, (ℓ/(k, ℓ))·a ∉ Z. Once again, (k, ℓ) is a good pair if this interval 
corresponds to a closed, non-self-repeating geodesic. 

(a) Let (p, q) = 1. If p and q are both odd, then (3p, 3q) is a good pair. If 
one of p, q is even, then either (2p, 2q) or (4p, 4q) is a good pair. (We do not know 
which one.) 

(b) Up to parallelism and symmetries of the cube, the pairs from Part (a) yield 
all closed geodesics on the cube. 

(c) Only the geodesics corresponding to the good pairs (0, 4), (3, 3) and (2, 4) 
are non-self-intersecting. 




LECTURE 21 

Non-inscribable Polyhedra 

21.1 Main theorem. The vertices of a randomly chosen convex polyhedron 
are not likely to lie on a sphere. For example, the pyramid in Figure 21.1 is not 
inscribed into a sphere if its quadrilateral base is not inscribed into a circle. But one 
can easily adjust the shape of the base so that it becomes an inscribed quadrilateral, 
and then the pyramid becomes an inscribed polyhedron. One is tempted to think 
that every convex polyhedron can be adjusted to get inscribed into a sphere. 




Figure 21.1. Deforming a pyramid to inscribe it into a sphere 

This is not at all so! In 1928, E. Steinitz proved the following theorem. 

Theorem 21.1. Let P be a convex polyhedron whose vertices are colored black 
and white so that there are more black vertices and no two black vertices are 
adjacent. Then P cannot be inscribed into a sphere. 

The condition is purely combinatorial, so no deformation can make P inscribed. 
Thus P is a non-inscribable polyhedron. 

Here is an example of a polyhedron satisfying this condition. Consider an 
octahedron and color its vertices white. Attach a tetrahedron to every face (with 
sufficiently small altitude, so as not to violate convexity) and color these new vertices 
black. We have 8 black and 6 white vertices, and no two black ones are connected 
by an edge, see Figure 21.2. Equally well, one may start with an icosahedron instead 
of an octahedron. 
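
Since the condition of Theorem 21.1 is purely combinatorial, it can be verified on the graph of vertices and edges alone. The sketch below is an illustration under our own conventions (the vertex labels and the function name are hypothetical): it builds the graph of the octahedron with a tetrahedron attached to every face and checks the two requirements of the theorem. It says nothing about convexity, which depends on the altitudes of the attached tetrahedra.

```python
from itertools import combinations, product

# White vertices: the six vertices of the octahedron, labeled (axis, sign).
white = [(axis, sign) for axis in range(3) for sign in (+1, -1)]

# Two vertices of the octahedron are joined by an edge unless they are antipodal.
edges = {frozenset((u, v)) for u, v in combinations(white, 2) if u[0] != v[0]}

# Each face takes one vertex on every axis; a black apex is attached to it.
faces = [tuple((axis, s) for axis, s in enumerate(signs))
         for signs in product((+1, -1), repeat=3)]
black = [("apex", face) for face in faces]
for apex in black:
    for v in apex[1]:
        edges.add(frozenset((apex, v)))

def steinitz_condition(black, white, edges):
    """More black than white vertices, and no edge joins two black vertices."""
    black_set = set(black)
    no_black_edge = all(len(e & black_set) < 2 for e in edges)
    return len(black) > len(white) and no_black_edge

print(len(black), len(white), len(edges))
print(steinitz_condition(black, white, edges))
```

The graph has 14 vertices, 36 edges and 24 triangular faces, in agreement with Euler's formula 14 − 36 + 24 = 2.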




Figure 21.2. A non-inscribable polyhedron 

At first glance, the situation appears paradoxical. A regular octahedron is 
already inscribed into a sphere. One can easily choose the sizes of the tetrahedra, 
attached to its faces, so that their new vertices lie on the same sphere, and we 
obtain an inscribed polyhedron. What goes wrong is that this new polyhedron is 
not convex! 

Proof. Consider a sphere S and a wedge (formed by two planes) whose edge 
intersects S at points A and B. The intersections of the two planes that make 
the wedge with the sphere are two circles. Let α be the angle between the circles 
evaluated at point A. We shall call α the dihedral angle relative to the sphere or, for 
short, the relative dihedral angle. Clearly, one may equally well choose the point B: 
this yields a congruent angle, see Figure 21.3. An exterior relative dihedral angle 
is π − α where α is a relative dihedral angle. 

Consider now a convex polyhedral cone whose vertex A lies on the sphere S 
and whose edges intersect the sphere. Then the sum of its exterior relative dihedral 
angles is 2π. Indeed, one can use the tangent plane to S at A to evaluate the relative 
dihedral angles as follows. Move this plane parallel to itself inside the sphere so 
that it intersects all the edges of the cone. The intersection is a convex polygon 
whose angles are equal to the relative dihedral angles. But the sum of the exterior 
angles of any convex polygon is 2π, and this proves our claim; see Figure 21.4. 

After this preparation, consider an inscribed polyhedron P satisfying the 
condition of the theorem. For every vertex, the sum of the exterior relative dihedral 
angles is 2π. Let Σ be the sum of these angles, taken with positive signs for the 
white vertices, and negative signs for the black vertices. On the one hand, there 
are more black vertices, so Σ < 0. 

Figure 21.3. Relative dihedral angles 




Figure 21.4. Proof of Theorem 21.1 



On the other hand, there are two kinds of edges: with two white vertices and 
with one white and one black vertex. The exterior relative dihedral angles at the 
end points of an edge are equal. For a white-white edge, the two contributions to 
Σ are positive, and for a white-black edge the contributions cancel. Thus Σ ≥ 0, a 
contradiction. □ 

21.2 Another example. Theorem 21.1 gives a sufficient condition for a convex 
polyhedron not to be inscribable, but it is by no means necessary. Consider the 
example shown in Figure 21.5. This truncated cube P does not satisfy the conditions 
of Theorem 21.1. Let us prove that P is not inscribable. 

Figure 21.5. Truncated cube 



Consider a polyhedral cone with three faces and vertex A whose edges intersect 
a sphere S. 

Lemma 21.1. The sum of the relative dihedral angles is less than, equal to, or 
greater than π depending on whether A lies outside, on or inside S. 

Proof. The faces of the cone intersect the sphere along three circles, and the 
angles between these circles are the relative dihedral angles. Pick a point of the 
sphere not on the circles and project S stereographically from this point. We obtain 
three circles in the plane, and the angles between the circles are the same as on the 
sphere (this is because stereographic projection preserves angles and takes circles 
to circles). 

If A lies on S then the three circles intersect at one point, and the sum of the 
angles is π, see Figure 21.6. If A lies off S, there are two cases, shown in Figure 
21.6, depending on whether A is outside or inside. In the former, the sum of angles 
is less than π, and in the latter greater than π. □ 



Let us return to the truncated cube P, and assume that it is inscribed into a 
sphere S. Denote by Q the initial cube, that is, the polyhedron whose truncation 
is P. Let us color the vertices of Q black and white so that the end points of each 
edge have opposite colors. Call the truncated vertex A and assume that it was 
white. 

Assign to each edge of Q the exterior relative dihedral angle of the respective 
edge of P. For each vertex of Q, except A, the sum of these angles is 2π. The vertex 
A clearly lies outside of S so, by Lemma 21.1, the sum of the relative dihedral angles 
at A is less than π, and therefore, the sum of the exterior relative dihedral angles 
is greater than 2π. 

Figure 21.6. Mutual position of three circles 

Now, as in the proof of Theorem 21.1, sum up, with signs, these sums of exterior 
relative angles over all vertices of Q. On the one hand, we get zero: every edge has one 
black and one white end point. On the other hand, this sum is positive: the seven vertices 
of Q other than A, 3 white and 4 black, contribute −2π to the total, and the contribution 
of A is greater than 2π. A contradiction. □ 

In conclusion, let us mention that there is a third type of mutual position of 
a sphere and convex polyhedron: when all the edges are tangent to the sphere. 
P. Koebe proved in 1936 that such polyhedra realize all combinatorial types of 
convex polyhedra. Much later, in 1992, O. Schramm proved in the paper titled 
"How to cage an egg" [69] that the sphere can be replaced by an arbitrary ovaloid. 



John Smith Martyn Green Henry Williams 

January 23, 2010 August 2, 1936 June 6, 1944 

21.3 Exercises. 

21.1. Make an explicit computation to check that if one attaches tetrahedra to 
the faces of an octahedron so that their vertices lie on the circumscribed sphere of 
the octahedron then the resulting polyhedron is not convex, see Figure 21.2. 

21.2. Prove that the stereographic projection from the sphere to the plane takes 
circles to circles and preserves the angles between them. 

21.3. * Let P be a polyhedron whose faces are colored black and white so that 
there are more black faces and no two black faces are adjacent. Then P is not 
circumscribed about a sphere. 

A simple example of such a polyhedron is obtained by cutting off all the vertices 
of a cube, see Figure 21.7. 

21.4. Consider a polyhedron such that every vertex is adjacent to the same 
number of faces. Prove that if the vertices are colored black and white in such a 
way that no two vertices of the same color are adjacent then the number of black 
vertices equals the number of white ones. 

21.5. Prove that the vertices of a polyhedron can be colored black and white 
in such a way that no two vertices of the same color are adjacent if and only if each 
face has an even number of edges. 

Hint. Color one vertex, then the adjacent vertices, then their adjacent ones, 
etc. This process either results in the desired coloring or there is a closed path 
made of an odd number of edges of the polyhedron. 







Figure 21.7. A non-circumscribable polyhedron 




LECTURE 22 

Can One Make a Tetrahedron out of a Cube? 

22.1 Hilbert's Third Problem. Is it possible to cut a cube by finitely many 
planes and assemble, out of the polyhedral pieces obtained, a regular tetrahedron 
of the same volume? 

This is a slight modification of one of the 23 problems presented by David 
Hilbert in his famous talk at the Congress of Mathematicians in Paris, on August 
8, 1900; it goes under the number 3. Hilbert's problems had a tremendous impact 
on mathematics. Most of them were solved during the 20th century, and each 
has a very special history. Still, the Third Problem remains exceptional in many 
respects. 

First, this was the first of Hilbert's problems to be solved. The solution belonged 
to a 23 year old German geometer, Hilbert's student Max Dehn [20]. His 
article appeared two years after the Paris Congress, but the solution was found 
earlier, maybe even before Hilbert stated the problem. 

Dehn's proof (more or less the same as the one presented below) was short and 
clear, and it became one of the favorite subjects for popular lectures, articles, and 
books in geometry, like the one you are holding in your hands. But among working 
mathematicians, it was almost forgotten. 

Certainly, the name of Dehn was not forgotten. He became one of the few top 
experts in the topology of three-dimensional manifolds, and his work of 1902 has 
never been regarded as his main achievement. 

In 1976, the American Mathematical Society published a two-volume collection 
of articles under the title "Mathematical Developments Arising from Hilbert 
Problems" [93]. It was a very solid account of the three-quarter-century history 
of the problems: solutions, full and partial, generalizations, similar problems, and 
so on. This edition contains a thorough analysis of 22 of Hilbert's 23 problems. 
And only the Third Problem is not discussed there. The opinion of the editors is 
obvious: no developments, no influence on mathematics; nothing to discuss. 

How strange it seemed just a couple of years later! Dehn's theorem, Dehn's 
theory, Dehn's invariant became one of the hottest subjects in geometry. This 
was stimulated by the then newborn K-theory, an exciting domain developed on the 
borderline between algebra and topology. We shall not follow this development, 
but shall just present the theorem and its proof. 

22.2 For a similar problem in the plane the answer is yes. 

Theorem 22.1 (Wallace, Bolyai, Gerwien). Let $P_1$ and $P_2$ be two plane polygons 
of the same area. Then it is possible to cut $P_1$ into pieces by straight lines and 
to reassemble these pieces as $P_2$. 

Proof. First, it is clear that it is sufficient to consider the case when $P_2$ is a 
rectangle with one side of length 1 and with the same area as $P_1$; in doing this, we can 
shorten the notation $P_1$ to just P. 

Second, since any polygonal domain can be cut into triangles, we can reduce 
the general case to that of a triangle (see Figure 22.1). 

Figure 22.1. Reducing the general polygonal case to that of a triangle 

Third, we need to transform, by cutting and pasting, a triangle into a rectangle 
with one of the sides having length one. This is done, in four steps, in Figure 22.2. 

First, we make a parallelogram out of our triangle (Step 1). Then we cut a 
small triangle on one side of the parallelogram and attach it to the other side in 
such a way that the length of one of the sides of the parallelogram becomes rational, 
p/q (Step 2). In Step 3, we make this parallelogram a rectangle (the number of 
horizontal cuts needed depends on the shape of the parallelogram). In the final 
step, we cut the rectangle into pq equal pieces by p − 1 horizontal lines and q − 1 
vertical lines (with the understanding that it is the vertical side of the rectangle 
that has the length p/q); then we rearrange these pq pieces into a rectangle with 
the length of the vertical side being 1. 
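
The arithmetic of the final step can be followed in a small sketch. The function name is hypothetical, and the area is taken to be rational only so that exact fractions can be used; in the proof the area is an arbitrary positive number. Each of the pq pieces has height 1/q, so a column of q pieces has height 1, and p such columns placed side by side restore the area.

```python
from fractions import Fraction

def rearrange(p, q, area):
    """Cut a rectangle of height p/q and the given area into p*q equal pieces
    and restack them; return (piece_height, piece_width, new_height, new_width)."""
    height = Fraction(p, q)
    width = Fraction(area) / height
    piece_h = height / p          # p - 1 horizontal cuts give p rows
    piece_w = width / q           # q - 1 vertical cuts give q columns
    # Stack q pieces in each column and place p such columns side by side.
    new_h = q * piece_h
    new_w = p * piece_w
    return piece_h, piece_w, new_h, new_w

piece_h, piece_w, new_h, new_w = rearrange(3, 5, Fraction(7, 2))
print(piece_h, piece_w)   # dimensions of one of the 15 pieces
print(new_h, new_w)       # the new rectangle has vertical side 1
```

In this instance a rectangle of height 3/5 and area 7/2 is turned into a rectangle with sides 1 and 7/2.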

22.3 A planar problem which does not look similar to Hilbert's Third 
Problem, but has a similar solution. Is it possible to cut a $1 \times 2$ rectangle into 
finitely many smaller rectangles with sides parallel to the sides of the given rectangle 
and to assemble a $\sqrt{2} \times \sqrt{2}$ square? 

The answer is NO. The proof is more algebraic than geometric, but still, unlike 
the Hilbert Problem, it requires a small geometric preparation. 

22.3.1 A geometric preparation. Let us be given two rectangles with vertical 
and horizontal sides (below, we shall call such rectangles briefly VH-rectangles), 
and suppose that it is possible to cut them into smaller VH-rectangles such that 
the pieces of the first are equal (congruent) to the pieces of the second. 

Figure 22.2. Making a triangle into a rectangle 

Then there exists a collection of N (still smaller) VH-rectangles such that each 
of the given rectangles can be obtained by a sequence of N − 1 admissible moves. 
An admissible move is: we take two of our small rectangles having equal widths or 
equal heights and attach them to each other vertically or horizontally creating one 
rectangle of the same width or height. Thus our process of cutting is replaced by 
the process of attaching rectangles. How to do this is shown in Figure 22.3. 

Figure 22.3. Creating rectangles by admissible moves (panels (A)–(E)) 

Suppose that two rectangles are cut into equal pieces as requested by the Problem 
(the rectangles (A) and (C) in Figure 22.3; equal pieces are marked there by 
the same Arabic numerals). Then we extend the sides of the pieces of the rectangle 
(A) to the whole width or length of this rectangle (see the rectangle (B) in Figure 
22.3). Some of the pieces of the division are cut into smaller pieces (marked by 
Roman letters in the rectangle (B): so 1 becomes a union of a and d, 2 becomes 
a union of g and h, etc.). Then we divide in the same way the pieces of the second 
given rectangle (see the rectangle (D) in Figure 22.3; we break the rectangle 1 of 
the rectangle (C) into pieces congruent to a and d, the rectangle 2 into pieces g, h, 
and so on). We obtain a new division of the second given rectangle, (C), into smaller 
rectangles, and again extend the sides of these smaller pieces to the whole width 
or length of the rectangle (see the rectangle (E) of Figure 22.3). These last pieces 
form our collection. Obviously we can assemble the rectangle (C) from these pieces 
using the admissible moves. Other admissible moves produce, out of our small 
rectangles, the parts of the finer division of the rectangle (A) (that is, a, b, c, ..., i), 
and out of these parts we can assemble, using admissible moves, the rectangle (A). 
The geometric preparation is over. 

22.3.2 An algebraic proof. Let us have a finite collection of VH-rectangles with 
total area 2. Suppose that one can compose out of these rectangles, using only 
admissible moves, a $(1 \times 2)$-rectangle. Then it is impossible to compose out of these 
rectangles, using only admissible moves, a $(\sqrt{2} \times \sqrt{2})$-square. 

This is what we need to prove that the answer to the question of this section 
is negative. 

Let $w_1, \dots, w_N$ be the widths of the rectangles of our collection ($N$ being the 
number of these rectangles), and $h_1, \dots, h_N$ be their heights. 
Consider the sequence 

(22.1) $1, \sqrt{2}, w_1, \dots, w_N$; 

remove all terms in this sequence that are linear combinations of the preceding terms 
with rational coefficients. (Thus, we do not remove 1; we do not remove $\sqrt{2}$, since 
it is irrational; we remove $w_1$ if and only if $w_1 = r_1 + r_2\sqrt{2}$ with rational $r_1, r_2$, 
and so on.) Let $a_1, \dots, a_m$ be the remaining numbers (thus, $a_1 = 1$, $a_2 = \sqrt{2}$). It 
is important that each of the numbers (22.1) can be presented as a rational linear 
combination of the numbers $a_1, \dots, a_m$ in a unique way. 

(This is a standard theorem from linear algebra, but for the sake of completeness,
let us give a proof. $1 = a_1$ is a rational linear combination of $a_1, \dots, a_m$,
and so is $\sqrt{2} = a_2$. Assume, by induction, that all the numbers (22.1) preceding
$w_k$ are rational linear combinations of $a_1, \dots, a_m$. If $w_k$ is not a rational linear
combination of preceding numbers, then it is one of the $a_i$'s, and hence is a rational
linear combination of $a_1, \dots, a_m$; if $w_k$ is a rational linear combination of preceding
numbers, then it is a rational linear combination of $a_1, \dots, a_m$, since all the
preceding numbers are rational linear combinations of $a_1, \dots, a_m$. It remains to
prove uniqueness. If two different rational linear combinations of $a_1, \dots, a_m$ are
equal, $\sum_{i=1}^{m} r'_i a_i = \sum_{i=1}^{m} r''_i a_i$, and $s$ is the largest of $1, \dots, m$ for which $r'_s \ne r''_s$,
then

$$a_s = \sum_{i=1}^{s-1} \frac{r'_i - r''_i}{r''_s - r'_s}\, a_i,$$

which shows that $a_s$ is a rational linear combination of
preceding $a_i$'s, in contradiction to the choice of $a_1, \dots, a_m$.)
Now, do the same with the sequence

(22.2) $1, \sqrt{2}, h_1, \dots, h_N$.

We shall get the numbers $b_1, \dots, b_n$ with $b_1 = 1$, $b_2 = \sqrt{2}$ such that each of the
numbers (22.2) can be presented as a rational linear combination of the numbers
$b_1, \dots, b_n$ in a unique way.
Call a rectangle admissible if its width is a rational linear combination of
$a_1, \dots, a_m$ and its height is a rational linear combination of $b_1, \dots, b_n$. Let $P$ be
an admissible rectangle of width $w$ and height $h$, and let $w = \sum_{i=1}^{m} r_i a_i$
and $h = \sum_{j=1}^{n} s_j b_j$ with rational $r_i$'s and $s_j$'s. We define the symbol $\mathrm{Symb}(P)$
of the rectangle $P$ as the rational $m \times n$ matrix $\|s_{ij}\|$ with $s_{ij} = r_i s_j$. We shall
use the notation $\mathrm{Symb}(P) = \sum_{i,j} r_i s_j\, a_i \otimes b_j$ for the symbols (which is simply
an alternative notation for the matrix above). Thus, we regard the symbols as
"formal rational linear combinations" of the "expressions" $a_i \otimes b_j$. Such formal linear
combinations can be added in the obvious way; we consider two formal rational
linear combinations $\sum_{i,j} t'_{ij}\, a_i \otimes b_j$ and $\sum_{i,j} t''_{ij}\, a_i \otimes b_j$ equal if $t'_{ij} = t''_{ij}$ for all $i, j$.

Let $P'$ and $P''$ be two admissible rectangles of equal heights or equal widths.
Then we can merge these two rectangles into one rectangle, $P$, using an admissible
move. Obviously, $P$ is also an admissible rectangle, and $\mathrm{Symb}(P) = \mathrm{Symb}(P') +
\mathrm{Symb}(P'')$. Indeed, if $P'$ and $P''$ have widths $w' = \sum_{i=1}^{m} r'_i a_i$ and $w'' = \sum_{i=1}^{m} r''_i a_i$
and the same height $h = \sum_{j=1}^{n} s_j b_j$, then $P$ has the width $w' + w'' = \sum_{i=1}^{m} (r'_i + r''_i) a_i$
and the height $h$, and

$$\begin{aligned}
\mathrm{Symb}(P) &= \sum_{i,j} (r'_i + r''_i) s_j\, a_i \otimes b_j \\
&= \sum_{i,j} r'_i s_j\, a_i \otimes b_j + \sum_{i,j} r''_i s_j\, a_i \otimes b_j \\
&= \mathrm{Symb}(P') + \mathrm{Symb}(P'').
\end{aligned}$$

Thus, if we have a collection of admissible rectangles, $P_1, \dots, P_N$, and can assemble
out of them, by $N - 1$ admissible moves, a rectangle $P$, then $\mathrm{Symb}(P) =
\sum_{i=1}^{N} \mathrm{Symb}(P_i)$. If we can assemble in this way two different rectangles, $P$ and $P'$,
then $\mathrm{Symb}(P') = \mathrm{Symb}(P)$. This proves our theorem, since the symbol of a $1 \times 2$
rectangle is $2(a_1 \otimes b_1)$, and the symbol of a $\sqrt{2} \times \sqrt{2}$ square is $a_2 \otimes b_2$, which is
different. □
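The symbol computation can be checked mechanically. The following Python sketch is only an illustration under a simplifying assumption: every width and height lies in the rational span of $1$ and $\sqrt{2}$, so that $m = n = 2$ and a number $r_1 + r_2\sqrt{2}$ is stored as the pair $(r_1, r_2)$. The helper names `symb` and `add_symbols` are ours and do not come from the text.

```python
from fractions import Fraction

# Assumption: the bases are a = b = (1, sqrt(2)), so a number r1 + r2*sqrt(2)
# is represented by the pair (r1, r2) of rational coefficients.
ONE, TWO, SQRT2 = (1, 0), (2, 0), (0, 1)

def symb(width, height):
    """Symb(P) as the matrix with entries r_i * s_j (rows: a_i, columns: b_j)."""
    return [[Fraction(r) * Fraction(s) for s in height] for r in width]

def add_symbols(A, B):
    """Entrywise sum of two symbols."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

rect_1x2 = symb(ONE, TWO)        # 2 (a_1 (x) b_1)
square = symb(SQRT2, SQRT2)      # a_2 (x) b_2

# Additivity under a merge: rectangles of widths 1 and sqrt(2) - 1 and common
# height sqrt(2) merge into the sqrt(2) x sqrt(2) square.
left = symb(ONE, SQRT2)
right = symb((-1, 1), SQRT2)
merged = add_symbols(left, right)

print(rect_1x2 != square, merged == square)
```

The two symbols differ in every nonzero entry, which is exactly the obstruction used in the proof; the merge check illustrates the additivity $\mathrm{Symb}(P) = \mathrm{Symb}(P') + \mathrm{Symb}(P'')$ on one example.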



22.4 Proof of Dehn's Theorem. We want to prove the following. 

Theorem 22.2. Let C and T be a cube and a regular tetrahedron of the same
volume. Suppose that each of them is cut into the same number of pieces by planes.
(That is, we cut our polyhedron into two pieces, then cut one of the two pieces into
two pieces, then cut one of the three pieces into two pieces, and so on.) It is not
possible that the two collections of (polyhedral) pieces are the same.

Proof. Let $\ell_1, \dots, \ell_N$ be the lengths of all edges of all polyhedra involved in
the two cutting processes. Let $\varphi_1, \dots, \varphi_N$ be the corresponding dihedral angles
(we suppose that $0 < \varphi_i < \pi$ for all $i$). Take the sequence $\ell_1, \dots, \ell_N$ and remove
from it any term which is a rational linear combination of the previous terms; we
obtain a sequence $a_1, \dots, a_m$ such that each of the $\ell_i$'s is equal to a unique rational
linear combination of the $a_i$'s. Then do the same with the sequence $\pi, \varphi_1, \dots, \varphi_N$; the
resulting sequence is denoted by $\alpha_0 = \pi, \alpha_1, \dots, \alpha_n$, and each of the $\varphi_i$'s is equal
to a unique rational linear combination of the $\alpha_j$'s. Call a convex polyhedron admissible if the
length of every edge is a rational linear combination of $a_1, \dots, a_m$ and each dihedral
angle is a rational linear combination of $\alpha_0, \alpha_1, \dots, \alpha_n$.

Let $m_1, \dots, m_q$ be the lengths of edges of an admissible convex polyhedron $P$,
and let $\varphi_1, \dots, \varphi_q$ be the corresponding dihedral angles. Let $m_k = \sum_{i=1}^{m} r_{ki} a_i$ and
$\varphi_k = \sum_{j=0}^{n} s_{kj} \alpha_j$. Similarly to the symbol considered in the previous section, we
define the Dehn invariant of $P$ by the formula

$$\mathrm{Dehn}(P) = \sum_{i=1}^{m} \sum_{j=1}^{n} \left( \sum_{k=1}^{q} r_{ki} s_{kj} \right) a_i \otimes \alpha_j .$$

Important remark: it is not a misprint that the second summation is taken
from $j = 1$ to $n$, not from $j = 0$ to $n$; we do not include into the Dehn invariant
the summands involving $s_{k0}\pi$. Thus, if one changes an angle by a rational multiple of $\pi$, then
the Dehn invariant is not affected; if some dihedral angle is a rational multiple of
$\pi$, then the corresponding edge does not appear in the expression for the Dehn
invariant at all.

Example 22.1. The Dehn invariant of a cube (or of a rectangular box) is zero.
Indeed, all the angles are $\pi/2$.

Lemma 22.2. Let $P$ be a convex polyhedron. Suppose that it is cut by a plane
$L$ into two pieces, $P'$ and $P''$. Then (provided that $P$, $P'$, and $P''$ are admissible)

$$\mathrm{Dehn}(P) = \mathrm{Dehn}(P') + \mathrm{Dehn}(P'').$$

Proof. Let $S = \{e_1, \dots, e_q\}$ be the set of all edges of $P$, let $\ell_k$ be the length of
the edge $e_k$ and $\varphi_k$ be the corresponding dihedral angle. We divide the set $S$ into
four subsets: $S_1$ consists of edges which have no interior points in $L$ and lie on the
$P'$ side of $L$; $S_2$ is the similar set with $P''$ instead of $P'$; $S_3$ consists of edges $e_k$ cut
by $L$ into an edge $e'_k$ of $P'$ and an edge $e''_k$ of $P''$; and $S_4$ consists of edges which
are totally contained in $L$; for each $e_k \in S_4$, the dihedral angle $\varphi_k$ is divided by $L$
into two parts: $\varphi'_k$ and $\varphi''_k$. Consider also the intersection $L \cap P$. This is a convex
polygon; each $e_k \in S_4$ is its side; let $T = \{f_1, \dots, f_p\}$ be the set of the other sides.
Each $f_k$ is an edge of both $P'$ and $P''$; let $m_k$ be the length of $f_k$ and $\chi'_k, \chi''_k$ the
corresponding dihedral angles in $P'$ and $P''$. Obviously, $\chi'_k + \chi''_k = \pi$.

The edges of $P'$ are:

- the edges $e_k \in S_1$; the lengths are $\ell_k$, the angles are $\varphi_k$;

- the edges $e'_k$ for $e_k \in S_3$; the lengths are $\ell'_k$, the angles are $\varphi_k$;

- the edges $e_k \in S_4$; the lengths are $\ell_k$, the angles are $\varphi'_k$;

- the edges $f_k \in T$; the lengths are $m_k$, the angles are $\chi'_k$.

The edges of $P''$ are:

- the edges $e_k \in S_2$; the lengths are $\ell_k$, the angles are $\varphi_k$;

- the edges $e''_k$ for $e_k \in S_3$; the lengths are $\ell''_k$, the angles are $\varphi_k$;

- the edges $e_k \in S_4$; the lengths are $\ell_k$, the angles are $\varphi''_k$;

- the edges $f_k \in T$; the lengths are $m_k$, the angles are $\chi''_k$.

The Dehn invariant of each of the polyhedra $P'$, $P''$, and $P$ consists of four
groups of summands; for $P'$ and $P''$ these groups correspond to the four groups
of edges as listed above; for $P$ they correspond to the sets $S_1, S_2, S_3, S_4$. The first
group of summands in $\mathrm{Dehn}(P')$ is the same as the first group of summands in
$\mathrm{Dehn}(P)$. The first group of summands in $\mathrm{Dehn}(P'')$ is the same as the second
group of summands in $\mathrm{Dehn}(P)$. The sum of the second groups of summands
in $\mathrm{Dehn}(P')$ and $\mathrm{Dehn}(P'')$ is the third group of summands in $\mathrm{Dehn}(P)$ because
$\ell'_k + \ell''_k = \ell_k$. The sum of the third groups of summands in $\mathrm{Dehn}(P')$ and $\mathrm{Dehn}(P'')$
is the fourth group of summands in $\mathrm{Dehn}(P)$ because $\varphi'_k + \varphi''_k = \varphi_k$. Finally, the
sum of the fourth groups of summands in $\mathrm{Dehn}(P')$ and $\mathrm{Dehn}(P'')$ is zero, since
$\chi'_k + \chi''_k = \pi$. Thus, $\mathrm{Dehn}(P) = \mathrm{Dehn}(P') + \mathrm{Dehn}(P'')$ as stated in the Lemma. □

Figure 22.4. Proof of Lemma 22.2



An example is shown in Figure 22.4. A polyhedron P (a four-gonal prism with 
non-parallel bases, shown at the left of the first row) is cut into two polyhedra by a 
plane (the cut is shown in the first row, the polyhedra P' and P" are shown in the 
second row). The edges of $P$ are $e_1, \dots, e_{12}$; the sets $S_i$ are: $S_1 = \{e_1, e_3, e_4, e_5\}$,
$S_2 = \{e_6, e_7, e_9, e_{10}, e_{11}, e_{12}\}$, $S_3 = \{e_8\}$, $S_4 = \{e_2\}$.

Back to Theorem 22.2. If two polyhedra can be cut into the same collection of
polyhedral parts, then their Dehn invariants are both equal to the sum of the Dehn
invariants of the parts, and, hence, the Dehn invariants of the given two polyhedra
are equal to each other. But the Dehn invariant of a cube is equal to zero, since all
the angles are $\pi/2$ (see Example 22.1). The Dehn invariant of a regular tetrahedron
is equal to $6(\ell \otimes \alpha)$ where $\ell$ is the length of the edge and $\alpha$ is the dihedral angle.
All we need to check is that $\alpha$ is not a rational multiple of $\pi$.

Figure 22.5. The dihedral angle of a regular tetrahedron

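The dihedral angle of a regular tetrahedron is the largest angle of an isosceles
triangle whose sides are $\ell, \ell\frac{\sqrt{3}}{2}, \ell\frac{\sqrt{3}}{2}$ (see Figure 22.5). The cosine theorem shows
that

$$\cos\alpha = \frac{\left(\ell\frac{\sqrt{3}}{2}\right)^2 + \left(\ell\frac{\sqrt{3}}{2}\right)^2 - \ell^2}{2\left(\ell\frac{\sqrt{3}}{2}\right)\left(\ell\frac{\sqrt{3}}{2}\right)} = \frac{1}{3}.$$

Lemma 22.3. If $\cos\alpha = \frac{1}{3}$, then $\frac{\alpha}{\pi}$ is irrational.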

Proof. Otherwise, $\cos n\alpha = 1$ for some $n$. However, it is known from trigonometry
that

$$\cos n\alpha = P_n(\cos\alpha)$$

where $P_n$ is a polynomial of degree $n$ with integer coefficients and the leading
coefficient $2^{n-1}$ (cf. Lecture 7).

This is proved by induction. Statement: for all $n$,

$$\cos n\alpha = P_n(\cos\alpha), \qquad \sin n\alpha = Q_n(\cos\alpha) \cdot \sin\alpha$$

where $\deg P_n = n$, $\deg Q_n = n - 1$, and the leading coefficients of both $P_n$ and $Q_n$
are equal to $2^{n-1}$. For $n = 1$, this is true ($P_1(t) = t$, $Q_1(t) = 1$); assume that the
statement is true for some $n$. Then

$$\begin{aligned}
\cos(n+1)\alpha &= \cos n\alpha \cos\alpha - \sin n\alpha \sin\alpha \\
&= P_n(\cos\alpha)\cos\alpha - Q_n(\cos\alpha)\sin^2\alpha \\
&= P_n(\cos\alpha)\cos\alpha + Q_n(\cos\alpha)(\cos^2\alpha - 1); \\
\sin(n+1)\alpha &= \sin n\alpha \cos\alpha + \cos n\alpha \sin\alpha \\
&= Q_n(\cos\alpha)\sin\alpha\cos\alpha + P_n(\cos\alpha)\sin\alpha \\
&= \bigl(Q_n(\cos\alpha)\cos\alpha + P_n(\cos\alpha)\bigr)\sin\alpha .
\end{aligned}$$

Hence,

$$P_{n+1}(t) = P_n(t)\,t + Q_n(t)(t^2 - 1),$$

$$Q_{n+1}(t) = Q_n(t)\,t + P_n(t),$$

and the statement for the degrees and the leading terms follows.
Since the coefficients of $P_n$ are integers, this shows that

$$\cos n\alpha = P_n\!\left(\frac{1}{3}\right) = \frac{2^{n-1}}{3^n} + \frac{\text{an integer}}{3^{n-1}},$$

which cannot be an integer, in particular, 1. □

This proves Lemma 22.3 and completes the proof of Dehn's theorem. □
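The induction in the proof of Lemma 22.3 is easy to replay numerically. The sketch below is an illustration only (the function names are ours): it builds $P_n$ and $Q_n$ from the recurrences $P_{n+1}(t) = P_n(t)\,t + Q_n(t)(t^2-1)$, $Q_{n+1}(t) = Q_n(t)\,t + P_n(t)$, checks the degree and leading coefficient at each step, and evaluates $P_n(1/3)$ exactly with rational arithmetic.

```python
from fractions import Fraction

def poly_add(a, b):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    size = max(len(a), len(b))
    a = a + [0] * (size - len(a))
    b = b + [0] * (size - len(b))
    return [x + y for x, y in zip(a, b)]

def times_t(p, k=1):
    """Multiply a polynomial by t**k."""
    return [0] * k + p

def step(P, Q):
    """Induction step: P_{n+1} = t P_n + (t^2 - 1) Q_n, Q_{n+1} = t Q_n + P_n."""
    minus_Q = [-c for c in Q]
    P_next = poly_add(times_t(P), poly_add(times_t(Q, 2), minus_Q))
    Q_next = poly_add(times_t(Q), P)
    return P_next, Q_next

def evaluate(p, t):
    return sum(c * t**k for k, c in enumerate(p))

def denominators(max_n):
    """Reduced denominators of P_n(1/3) for n = 1..max_n."""
    P, Q = [0, 1], [1]                      # P_1(t) = t, Q_1(t) = 1
    result = []
    for n in range(1, max_n + 1):
        # degree n and leading coefficient 2^(n-1), as claimed in the induction
        assert len(P) == n + 1 and P[-1] == 2 ** (n - 1)
        result.append(evaluate(P, Fraction(1, 3)).denominator)
        P, Q = step(P, Q)
    return result

print(denominators(5))   # [3, 9, 27, 81, 243]
```

The reduced denominator of $P_n(1/3)$ is $3^n$ for every $n$ tested, in agreement with the argument that $\cos n\alpha$ is never an integer.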
22.5 Further results. In the language of algebra (which may be technically
unfamiliar to the reader, but the formulas below seem to us self-explanatory), the
construction of the previous section assigns to every convex (actually, not necessarily
convex) polyhedron a certain invariant,

$$\mathrm{Dehn}(P) \in \mathbb{R} \otimes_{\mathbb{Q}} (\mathbb{R}/\pi\mathbb{Q}),$$

and Dehn's theorem states that if two polyhedra, $P_1$ and $P_2$, are equipartite (that
is, can be cut by planes into identical collections of parts), then

$$\mathrm{Dehn}(P_1) = \mathrm{Dehn}(P_2).$$

This is precisely the result of the previous section.




Figure 22.6. These tetrahedra do not have to be equipartite 

Certainly, this may be applied not only to cubes and tetrahedra. Hilbert's
original problem, by the way, dealt with a different example; Hilbert conjectured
that two tetrahedra with equal bases and equal heights (like those in Figure 22.6)
are not equipartite.




Figure 22.7. Computing the volume of a tetrahedron by a limit- 
ing process 

The origin of this question belongs to the foundations of geometry. The whole
theory of volumes of solids is based on the lemma stating that the volumes of the 
tetrahedra in Figure 22.6 are the same. The similar planar lemma (involving areas 
of triangles) has a direct geometric proof based on cutting and pasting. But the 
three-dimensional fact requires a limit "stair construction" involving pictures like 
Figure 22.7 (you can find a figure like this in textbooks on spatial geometry). The 
question is, is this really necessary, and the answer is "yes": Dehn's theorem easily 
implies that the tetrahedra like those in Figure 22.6 are not, in general, equipartite. 

More than 60 years after Dehn's work, Sydler proved that polyhedra with equal 
volumes and equal Dehn invariants are equipartite [76]. There are similar results 
in spherical and hyperbolic geometries. 
Dehn's invariant may be generalized to polyhedra of any dimension: for an
$n$-dimensional polyhedron $P$,

$$\mathrm{Dehn}(P) = \sum_{\substack{(n-2)\text{-dimensional} \\ \text{faces } s \text{ of } P}} \mathrm{volume}(s) \otimes \left[\text{dihedral angle at } s\right] \in \mathbb{R} \otimes_{\mathbb{Q}} (\mathbb{R}/\pi\mathbb{Q})$$

(the angle is formed by the two $(n-1)$-dimensional faces of $P$ adjacent to $s$).
In dimension 4, as in dimension 3, two polyhedra are equipartite if and only if
their volumes and their Dehn invariants are the same. But in dimension 5 it
is not true any longer: there arises a new invariant, a "secondary Dehn invariant"
involving a summation over the edges (for an $n$-dimensional polyhedron, over
$(n-4)$-dimensional faces) of $P$. There is a conjecture that an "equipartite type" of
an $n$-dimensional polyhedron is characterized by a sequence of $\left[\frac{n+1}{2}\right]$ invariants:
the volume, Dehn invariant, secondary Dehn invariant, and so on, taking values in
more and more complicated tensor products (the $k$-th Dehn invariant involves a
summation over $(n-2k)$-dimensional faces. In particular, for one- and two-dimensional
polyhedra (segments and polygons) only the "volume" (the length and the area)
matters; in dimensions 3 and 4 we also have the Dehn invariant, and so on).

For more information about this subject, we recommend the popular book of 
Boltianskii [9], the talk of Cartier at the Bourbaki Seminar [13] and the books 
[25, 67, 91]. 




22.6 Exercises. 

22.1. Prove that the Dehn invariant of any rectangular prism with a polygonal 
base is zero. 

Exercises 22.2-22.4 are particular cases of Sydler's theorem (see Section 22.5).
Since we did not prove this theorem, we suggest solving these exercises by direct
constructions.

22.2. Prove that two collections of parallelepipeds of equal total volumes are 
equipartite. 

22.3. A regular octahedron $O$ with the edge 1 can be obtained from the regular
tetrahedron $\widetilde{T}$ with the edge 2 by cutting off 4 regular tetrahedra $T$ with the edge
1 contained in $\widetilde{T}$ and containing the 4 vertices of $\widetilde{T}$. Since, obviously, $\mathrm{Dehn}(\widetilde{T}) =
2\,\mathrm{Dehn}(T)$,

$$\mathrm{Dehn}(O) = \mathrm{Dehn}(\widetilde{T}) - 4\,\mathrm{Dehn}(T) = -2\,\mathrm{Dehn}(T).$$

Prove that the collection of the octahedron $O$ and two tetrahedra $T$ is equipartite
to a cube of the appropriate volume ($6\,\mathrm{Vol}(T)$).

22.4. (a) Let $T$ and $\widetilde{T}$ denote the same as in Exercise 22.3. Prove that $\widetilde{T}$ is
equipartite to a collection of two copies of $T$ and a cube.

(b) Generalization. Let $P$ be an arbitrary polyhedron and let $\widetilde{P}$ be its double
magnification (of the volume $8\,\mathrm{Vol}(P)$). Prove that $\widetilde{P}$ is equipartite to a collection
of two copies of $P$ and a cube of the volume $6\,\mathrm{Vol}(P)$.

Hint. (a) follows from Exercise 22.3; to prove (b), observe that (a) holds for any
(not necessarily regular) tetrahedra, and then cut $P$ into the union of tetrahedra.

22.5. A polyhedron $P$ is called a crystal if there exists a tiling of the whole space
by polyhedra congruent to $P$. Prove that the Dehn invariant of a crystal is 0.




LECTURE 23 

Impossible Tilings 

23.1 Introduction. This lecture is about tilings of plane polygons by other 
plane polygons. An example of such a problem is probably known to the reader: 

Two diagonally opposite squares (A1 and H8) are deleted from a chess board.
Can one tile this truncated board by $2 \times 1$ "dominos"? See Figure 23.1.



FIGURE 23.1. Can one tile this truncated chess board by dominos? 

A typical fragment of a tiling by dominos is shown in Figure 23.2. The tiles do 
not overlap (they touch each other along parts of their boundaries) , and every point 
of the board belongs to a tile. Note two things: we allow both horizontal and vertical 
positions of the tiles, and we do not assume that adjacent tiles necessarily share a 
full side. In general, a tiling problem is formulated as follows: given a polygon $P$
and a collection of polygons $Q_1, Q_2, \dots$, is it possible to tile $P$ by isometric copies
of the tiles $Q_i$?

In case the reader failed to solve the truncated chess board problem, its (nega- 
tive) solution appears in the next section. In what follows, we shall see many more 
examples of impossible tilings but the proofs will get more and more involved. 
Figure 23.2. A fragment of tiling



23.2 Coloring. To solve the truncated chess board tiling problem, recall that 
the chess board has a black-and-white coloring. The diagonally opposite squares 
are both black, and the truncated board has 30 black and 32 white squares. On 
the other hand, every $2 \times 1$ domino covers one black and one white square. Hence
the tiling is impossible, see Figure 23.3. 



There is an alternative way to present the black-and-white coloring argument.
Write 0 in each white square and 1 in each black one. The total sum of these
numbers on the truncated board is 30. But every domino has a 0 and a 1 written
on it, and 31 dominos will make the total sum equal to 31, not to 30. Thus no
tiling exists.
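The count behind the coloring argument takes a few lines to verify. The following Python sketch is only an illustration; it assumes squares are indexed by (column, row) with A1 = (0, 0) and H8 = (7, 7), and that a square is black when column + row is even, so that both deleted corners are black as in the text.

```python
def truncated_board():
    """All squares of the 8 x 8 board except A1 = (0, 0) and H8 = (7, 7)."""
    squares = {(col, row) for col in range(8) for row in range(8)}
    return squares - {(0, 0), (7, 7)}

def color_counts(squares):
    """Return (number of black squares, number of white squares)."""
    black = sum((col + row) % 2 == 0 for col, row in squares)
    return black, len(squares) - black

black, white = color_counts(truncated_board())
print(black, white)   # 30 32

# Writing 0 on white squares and 1 on black ones, the total on the board is
# the number of black squares, while 31 dominos would contribute 31.
print(black == 31)    # False
```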

Here is a variation on this argument. Can one tile a $10 \times 10$ square by L-shaped
tiles as shown in Figure 23.4? Note that a tile may now have 8 different
orientations!

The answer again is in the negative. Write 1s and 5s in the squares as depicted
in Figure 23.4. Each tile covers either three 1s and one 5, or three 5s and one 1. In
either case, the sum on a tile (8 or 16) is a multiple of 8. On the other hand, the total sum
of the numbers on the board is 300, which is not divisible by 8. Hence the tiling
does not exist.
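This argument can also be confirmed by brute force. The sketch below is an illustration under two assumptions that are not spelled out in the extracted text: the L-shaped tile is made of four unit squares (three in a line plus one attached at an end), and the labeling of Figure 23.4 alternates rows of 1s and rows of 5s. With these assumptions it enumerates every placement of the tile on the board.

```python
L_TILE = [(0, 0), (1, 0), (2, 0), (2, 1)]   # assumed shape: one position of the tile

def orientations(cells):
    """All rotations and mirror images of a tile, shifted to touch the axes."""
    found = set()
    for _ in range(2):
        for _ in range(4):
            cells = [(-y, x) for x, y in cells]          # rotate by 90 degrees
            min_x = min(x for x, _ in cells)
            min_y = min(y for _, y in cells)
            found.add(frozenset((x - min_x, y - min_y) for x, y in cells))
        cells = [(-x, y) for x, y in cells]              # mirror image
    return found

def label(col, row):
    # Assumed labeling: rows alternate between 1s and 5s.
    return 1 if row % 2 == 0 else 5

def tile_sums(size=10):
    """Set of label sums over all placements of the tile on a size x size board."""
    sums = set()
    for shape in orientations(L_TILE):
        for dx in range(size):
            for dy in range(size):
                placed = [(x + dx, y + dy) for x, y in shape]
                if all(x < size and y < size for x, y in placed):
                    sums.add(sum(label(x, y) for x, y in placed))
    return sums

board_total = sum(label(col, row) for col in range(10) for row in range(10))
print(len(orientations(L_TILE)), sorted(tile_sums()), board_total % 8)
```

Every placement has sum 8 or 16, while the board total 300 leaves remainder 4 modulo 8.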

23.3 What a coloring argument cannot do. Imagine that we have two
kinds of tiles: the usual, positive, ones and the negative ones, made of "anti-matter".
We are allowed to superimpose tiles so that the common parts of the positive and
negative ones annihilate each other, see Figure 23.5. It is convenient to write 1 on
each positive tile and $-1$ on each anti-tile. The multiplicity of a point is the sum
of these $\pm 1$s, taken over all tiles that cover this point. We say that a polygon $P$
admits a signed tiling if one can superimpose negative and positive tiles so that the
multiplicity of every point inside $P$ equals 1.

Figure 23.3. Coloring argument

Figure 23.4. Coloring modulo 8



Figure 23.5. Tiles and anti-tiles 

It is clear that if a coloring argument, like the ones discussed in Section 23.2,
proves that a polygon cannot be tiled by a certain collection of tiles, then this
proof implies that a signed tiling does not exist either. There are, however, tiling
problems that have solutions in signed tiles and no solutions in positive tiles only.

Consider a triangular array of dots as in Figure 23.6. We want to cover this 
triangle by "tribones" consisting of three dots; a tribone may have one of the three 
indicated orientations. For which values of n does such a tiling exist? 




Figure 23.6. Can one color a triangle by tribones? 

First of all, for a tiling to exist, the number of dots must be a multiple of 3.
This number is $n(n + 1)/2$, and hence $n \equiv 0$ or $2 \bmod 3$.

Let us now "color" the dots as shown in Figure 23.7. The sum of numbers
covered by each tile is divisible by 3. The total sum depends on $n$ periodically with
period 9, and its value mod 3 (for $n = 1, \dots, 9$) is as follows:

0, 2, 2, 2, 1, 1, 1, 0, 0.

Therefore $n \bmod 9$ should be either 1, or 8, or 0. We already know that $n \equiv 0$ or
$2 \bmod 3$, so only the latter two cases "remain on the table".



          0
         1 1
        2 2 2
       0 0 0 0
      1 1 1 1 1
     2 2 2 2 2 2
    0 0 0 0 0 0 0
   1 1 1 1 1 1 1 1
  2 2 2 2 2 2 2 2 2
 0 0 0 0 0 0 0 0 0 0

Figure 23.7. Coloring modulo 3
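The residues listed above can be recomputed directly. The following Python sketch is an illustration and assumes the labeling read off Figure 23.7: the row with $k$ dots carries the label $(k - 1) \bmod 3$ on each dot. The names `total` and `candidates` are ours.

```python
def total(n):
    """Sum of all labels in a triangular array of size n.

    Assumption: every dot in the row with k dots is labeled (k - 1) mod 3.
    """
    return sum(k * ((k - 1) % 3) for k in range(1, n + 1))

def candidates(limit):
    """Sizes n <= limit that pass both the dot-count and the coloring test."""
    return [n for n in range(1, limit + 1)
            if (n * (n + 1) // 2) % 3 == 0 and total(n) % 3 == 0]

print([total(n) % 3 for n in range(1, 10)])   # [0, 2, 2, 2, 1, 1, 1, 0, 0]
print(candidates(30))                         # [8, 9, 17, 18, 26, 27]
```

The surviving sizes are exactly those with $n \equiv 8$ or $0 \bmod 9$, matching the conclusion in the text.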

Let us show that if $n \equiv 8$ or $0 \bmod 9$ then there exists a signed tiling of the
triangular array by tribones. Figure 23.8 depicts such a tiling for $n = 8$, and Figure
23.9 shows how to build bigger arrays from triangles of size 8 and rows of tribones.




FIGURE 23.8. Signed tiling for n = 8 

We conclude that the following surprising theorem is beyond the reach of any 
coloring argument. 

Theorem 23.1. For every n, a triangular array of size n cannot be tiled by 
tribones. 

23.4 Conway's tiling group. To prove Theorem 23.1, we need to do some 
preparatory work. To fix ideas, assume that all the polygons, the region to be tiled 
and the tiles, are drawn on graph paper, that is, are composed of unit squares. We 
assume that every polygon involved does not have holes: its boundary consists of 
a single closed curve. 







Figure 23.9. Building larger signed tilings from smaller ones 

A path on the square grid will be described by a word in four symbols $x$, $x^{-1}$, $y$
and $y^{-1}$: a step right by $x$, a step left by $x^{-1}$, a step up by $y$ and a step down by
$y^{-1}$. See Figure 23.10, for example. We write $k$ consecutive symbols $x$ or $x^{-1}$ as
$x^{\pm k}$, and likewise for $y$, and denote by $e$ a trivial path. We also cancel consecutive
$x$ and $x^{-1}$ or $y$ and $y^{-1}$; for example, $xyy^{-1}x^{-1} = e$.



$xy^2xy^{-1}x$



Figure 23.10. A path and the respective word 



To compose two words $a$ and $b$, consider their concatenation and reduce it by
canceling all consecutive pairs of $x$ and $x^{-1}$ or $y$ and $y^{-1}$. The resulting word is
denoted by $ab$.

The composition satisfies the associativity law: $(ab)c = a(bc)$, where $a$, $b$ and $c$
stand for any words. Given a word $w$, the word $w^{-1}$ is obtained from $w$ by reading
its letters from right to left and changing their exponents to the opposite. For
example, $(xy^{-1})^{-1} = yx^{-1}$. Clearly, $ww^{-1} = e$.
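The operations on words described above are straightforward to implement. The sketch below is an illustration with an encoding of our own choosing: a word is a Python string in which `x`, `y` stand for the generators and the capital letters `X`, `Y` stand for $x^{-1}$, $y^{-1}$; the empty string plays the role of $e$.

```python
def reduce_word(word):
    """Cancel adjacent pairs x x^-1, x^-1 x, y y^-1, y^-1 y using a stack."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()              # the new letter cancels the previous one
        else:
            stack.append(letter)
    return "".join(stack)

def compose(a, b):
    """The product ab: concatenate, then reduce."""
    return reduce_word(a + b)

def inverse(word):
    """Read the letters right to left and flip every exponent."""
    return word[::-1].swapcase()

print(repr(reduce_word("xyYX")))   # ''  (the example x y y^-1 x^-1 = e)
print(inverse("xY"))               # yX  (that is, (x y^-1)^-1 = y x^-1)
```

A single left-to-right pass with a stack suffices because a cancellation can only expose a new cancelable pair at the top of the stack.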

Let $T_1, \dots, T_n$ be a complete list of tiles, positioned on the grid in all possible
orientations (so that a domino has two, and an L-shaped tile eight different
orientations). Choose a starting point on the boundary of $T_i$ and traverse this boundary
counter clock-wise. This closed path is encoded by a word $W_i$ in $x$, $x^{-1}$, $y$ and $y^{-1}$.
This word depends, of course, on the choice of the starting point.
So far, the only rule for manipulating words was

$$xx^{-1} = x^{-1}x = e = yy^{-1} = y^{-1}y.$$

Let us add to this rule the new ones: $W_1 = W_2 = \dots = e$. These rules mean that
whenever one of the words $W_i$ appears in a longer word, we may replace it by $e$,
and conversely, we may insert any of the words $W_i$ anywhere. If a word $V_1$ can
be obtained from another word $V_2$ by consecutive applications of these rules, we
call them equivalent and write simply $V_1 = V_2$.

We need to address an ambiguity in the choice of the words $W_i$, namely, their
dependence on the starting point. Let $p'$ be another starting point on the boundary
of the tile $T_i$, and let $W'_i$ be the word obtained by traversing the boundary starting
at $p'$.

Lemma 23.1. One has: $W'_i = e$.




FIGURE 23.11. Proving Lemma 23.1 

Proof. Let $u$ be (the code of) the path from $p$ to $p'$, and $v$ the path from $p'$ to
$p$, see Figure 23.11. Then $W_i = uv$ and $W'_i = vu$. Since $W_i = e$, we have $uv = e$.
Then $vu = u^{-1}(uv)u = u^{-1}u = e$, as claimed. □

Let P be a polygon. Traverse its boundary to obtain a word U (again depending 
on the starting point). The following proposition provides a necessary condition 
for tiling. 

Proposition 23.2. If $P$ is tiled by $T_1, \dots, T_n$ then $U = e$.

Proof. Induction on the number of tiles. If this number is one then $P$ is itself
a tile, say, $T_i$. The word $U$ is then what we called $W'_i$ above, and the claim follows
from Lemma 23.1.

Now suppose there are several tiles. Then we can cut the polygon $P$ into two
polygons, $P_1$ and $P_2$, by a path inside $P$, going from a boundary point $p$ to a
boundary point $p'$ and traveling only on the boundaries of the tiles, see Figure
23.12. Let $w$ be the word corresponding to this path $pp'$ inside $P$, and let $v_1$ and
$v_2$ be the boundary words of $P$ from $p$ to $p'$ and from $p'$ to $p$, respectively.

A closed counter clock-wise path along the boundary of $P$, starting at point $p$,
is encoded by the word $v_1v_2$. We have: $v_1v_2 = (v_1w^{-1})(wv_2)$. The words in the
parentheses are the boundary words of the polygons $P_1$ and $P_2$. By our choice of
the cutting path $pp'$, these two polygons are tiled by a smaller number of tiles. By
the induction assumption, $v_1w^{-1} = e$ and $wv_2 = e$. Therefore $v_1v_2 = e$ as well.¹

Finally, the boundary word $U$ may differ from $v_1v_2$ by the choice of the starting
point. We already know from Lemma 23.1 that if one of these words is equivalent
to $e$ then so is the other. This completes the proof. □



¹ Note the similarity of this argument to the one from the proof of the polyhedral Gauss-Bonnet
theorem 20.3.

Figure 23.12. Induction step in the proof of Proposition 23.2



It is convenient to summarize the constructions of this section in terms of
groups. The collection of tiles $T_1, \dots, T_n$ determines a group with two generators
$x$ and $y$ and the relations $W_1, \dots, W_n$. This group is called Conway's tiling group.
The boundary word of the polygon $P$ is an element of Conway's tiling group, and
if $P$ is tiled by $T_1, \dots, T_n$ then this is the unit element.



Example 23.3. Let us revisit the truncated chess board problem from the very 
beginning of this lecture. 

The two positions of the $2 \times 1$ dominos have the boundary words $W_1 =
x^2yx^{-2}y^{-1}$ and $W_2 = xy^2x^{-1}y^{-2}$, and the truncated chess board has the boundary
word $U = x^7y^7x^{-1}yx^{-7}y^{-7}xy^{-1}$, see Figure 23.13. We want to show that the
equalities $W_1 = W_2 = e$ do not imply that $U = e$; then, by Proposition 23.2, the
board cannot be tiled.

Figure 23.13. Truncated chess board revisited



Replace $x$ by the permutation (213) and $y$ by (132). Then $x^2$ and $y^2$ both
become equal to the trivial permutation (123), and hence both words $W_1$ and $W_2$
become trivial as well. It follows that if $U = e$ then, after $x$ and $y$ are replaced by
the permutations (213) and (132), we must obtain a trivial permutation.

But this is not the case! The reader will easily verify that $U = (312)$, a non-trivial
permutation.²



² In group-theoretical terms, we constructed a homomorphism from Conway's tiling group to
the group of permutations of three elements; this homomorphism takes the boundary word of the
truncated chess board to a non-trivial permutation.
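The verification left to the reader in Example 23.3 can be carried out as follows. This Python sketch is an illustration with conventions of our own: a permutation is written in one-line form as a tuple, words are strings with capital letters for inverse generators, and the permutations along a word are applied from left to right (the convention under which the result comes out as (312)).

```python
X = (2, 1, 3)            # image of x: 1 -> 2, 2 -> 1, 3 -> 3
Y = (1, 3, 2)            # image of y: 1 -> 1, 2 -> 3, 3 -> 2
IDENTITY = (1, 2, 3)

def then(p, q):
    """Apply p first, then q (the word is read left to right)."""
    return tuple(q[p[i] - 1] for i in range(3))

def inverse(p):
    inv = [0, 0, 0]
    for i, image in enumerate(p):
        inv[image - 1] = i + 1
    return tuple(inv)

LETTERS = {"x": X, "X": inverse(X), "y": Y, "Y": inverse(Y)}

def image_of(word):
    """Image of a word under the substitution x -> X, y -> Y."""
    result = IDENTITY
    for letter in word:
        result = then(result, LETTERS[letter])
    return result

W1 = "xxyXXY"                                            # x^2 y x^-2 y^-1
W2 = "xyyXYY"                                            # x y^2 x^-1 y^-2
U = "x" * 7 + "y" * 7 + "X" + "y" + "X" * 7 + "Y" * 7 + "x" + "Y"

print(image_of(W1), image_of(W2), image_of(U))
# (1, 2, 3) (1, 2, 3) (3, 1, 2)
```

Both domino words map to the identity while the boundary word of the truncated board does not, so $U$ cannot be a consequence of $W_1 = W_2 = e$.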
23.5 Proof of Theorem 23.1. It should not come as a surprise that Theorem
23.1 will take more work compared to Example 23.3: after all, the truncated chess
board problem has an easy coloring solution.

First of all, we know from Section 23.3 that a necessary condition for a tiling
is that $n \equiv 8$ or $0 \bmod 9$. If a tiling exists for $n \equiv 8 \bmod 9$ then, as Figure 23.9
shows, it also exists for $n \equiv 0 \bmod 9$. Thus it suffices to prove that the tilings do
not exist for $n$ a multiple of 9.

Redraw the triangular array of dots as a staircase-like polygon on graph paper:
one square in the first row, two in the second, etc. Then the tribones become the
three kinds of tiles depicted in Figure 23.14. The figure also indicates the boundary
words of these polygons. We want to prove that, for every $n$, the equalities $W_1 =
W_2 = W_3 = e$ do not imply that $U_n = e$.



$W_1 = x^3yx^{-3}y^{-1}$

$U_n = y^{-n}x^n(yx^{-1})^n$

Figure 23.14. Tiles and the corresponding words

Consider three families of evenly spaced oriented parallel lines, see Figure 23.15. 
These lines intersect at angles of 60° and form a tessellation of the plane by equilat- 
eral triangles and regular hexagons. Write letters x and y in the triangles as shown 
in Figure 23.15. We shall refer to this pattern of lines and letters as the hexagonal 
grid. 

The hexagonal grid is very symmetric. For its every two vertices, there exists a 
motion of the plane that takes one to another and preserves the grid. For example, 
a parallel translation takes vertex B to D in Figure 23.16, and the rotation through 
120° about point A, the center of a triangle marked x, takes vertex B to C. 

A path on the square grid can be shadowed on the hexagonal grid. A path
on the square grid is encoded by a word in $x, x^{-1}, y, y^{-1}$. At every vertex of the
hexagonal grid, two oriented lines meet, and two of the four angles are labeled $x$ and
$y$, see Figure 23.15. We interpret the symbols $x, x^{-1}, y, y^{-1}$ as instructions to build
the shadow path: $x^{\pm 1}$ means "move one step on the boundary of the angle labeled
$x$, along or against the orientation, respectively", and likewise for $y^{\pm 1}$. Thus, once
one chooses a starting point, a path on the square grid determines a path on the
hexagonal grid.

Figure 23.15 shows shadows, on the hexagonal grid, of the boundary paths of 
the three tiles from Figure 23.14. Note that all three shadow paths are closed; this 
fact holds true for any choice of the starting point of a shadow path, due to the 
symmetries of the hexagonal grid. In contrast, the path $xyx^{-1}y^{-1}$, that is, the
boundary of a single square, has a non-closed shadow.

Boundary words of the other two tiles in Figure 23.14: $W_2 = xy^3x^{-1}y^{-3}$, $W_3 = (yx^{-1})^3(y^{-1}x)^3$

Figure 23.15. The hexagonal grid: the shadows of the three tiles




FIGURE 23.16. Symmetries of the hexagonal grid 

Let us consider only those paths on the square grid whose shadows on the 
hexagonal grid are closed. The boundary paths of the three tiles satisfy this prop- 
erty and so does the boundary path of the staircase region in Figure 23.14; its 
shadow is shown in Figure 23.17 (we use the assumption that n is a multiple of 3). 

Figure 23.17. The shadow of the staircase region

An oriented closed curve partitions the plane into a number of components.
To each component, there corresponds the rotation number of the curve about any
point of this component. We discussed this notion in Lecture 12, see Figure 12.20.
The signed area, bounded by a closed curve, is the sum of areas of the components,
multiplied by the respective rotation numbers. For example, a counter clock-wise
oriented unit circle has signed area $\pi$, and a clock-wise oriented one has signed
area $-\pi$. In Calculus, signed area is defined as the integral, over the curve, of the
differential form $x\,dy$.
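For a closed polygonal path the integral of $x\,dy$ reduces to a finite sum, since along a segment from $(x_0, y_0)$ to $(x_1, y_1)$ it equals $\frac{1}{2}(x_0 + x_1)(y_1 - y_0)$. The following Python sketch illustrates the notion of signed area on generic examples; it does not construct the hexagonal grid or the shadow paths of the text, and the sample paths are our own.

```python
import math

def signed_area(vertices):
    """Integral of x dy over the closed polygonal path through the vertices."""
    area = 0.0
    count = len(vertices)
    for i in range(count):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % count]
        area += (x0 + x1) * (y1 - y0) / 2   # contribution of one segment
    return area

square_ccw = [(0, 0), (1, 0), (1, 1), (0, 1)]
bow_tie = [(0, 0), (2, 0), (0, 2), (2, 2)]   # two loops of opposite orientation
circle = [(math.cos(2 * math.pi * k / 1000), math.sin(2 * math.pi * k / 1000))
          for k in range(1000)]               # fine polygonal approximation

print(signed_area(square_ccw))                # 1.0
print(signed_area(square_ccw[::-1]))          # -1.0
print(signed_area(bow_tie))                   # 0.0
print(round(signed_area(circle), 3))          # 3.142
```

Reversing the orientation flips the sign, and a closed path whose two loops are traversed in opposite directions has signed area zero, which is the kind of cancellation the signed area records.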

Assign to a path on a square grid the signed area, bounded by its shadow on 
the hexagonal grid. For the boundary paths of the three tiles, this signed area is 
zero, see Figure 23.15. And for the boundary path of the staircase region, this 
signed area is negative, see Figure 23.17. 

This implies that $U_n \ne e$. Indeed, when we replace one of the words $W_1$, $W_2$
or $W_3$ by $e$, or vice versa, the signed area of the shadow path is not affected. This
area is zero for the trivial word $e$ but it is non-zero for the word $U_n$. This completes
the proof of Theorem 23.1. □

In conclusion of this section, here is another theorem that can be proved sim- 
ilarly to Theorem 23.1. Start with the same triangular array of dots but now we 
want to tile it by triangles made of three dots, see Figure 23.18. 




Figure 23.18. Can one tile the large triangle with small ones?

Theorem 23.2. Such a tiling exists if and only if $n \equiv 0, 2, 9$, or $11 \bmod 12$.

For more information on Conway's tiling group, see [17, 65, 85]. 

23.6 Back to Max Dehn. After solving Hilbert's Third problem, M. Dehn 
[21] proved in 1903 the following theorem. 

Theorem 23.3. If a rectangle is tiled by squares then the ratio of its side lengths 
is a rational number. 

The converse is obviously true, see Figure 23.19. The following proof is quite
similar to what we did in Section 22.3.³



p = 4 

q = 7 

Figure 23.19. If the ratio of the sides of a rectangle is rational 
then it can be tiled by squares 

Proof. Let us argue by contradiction. We can scale the rectangle so that its 
width is 1; let x be its height, an irrational number. 



Figure 23.20. Extending the sides of the tiles 

Assume that there is a tiling by squares. Extend the sides of the squares to
the full width or height of the rectangle, see Figure 23.20. Now we have a tiling
of our $x \times 1$ rectangle and of all the squares by a number of smaller rectangles; let
$a_1, \dots, a_N$ be their side lengths (in any order). Consider the sequence

(23.1)    $1, x, a_1, \dots, a_N$;

remove a term if it is a linear combination, with rational coefficients, of the
preceding terms. Since $x$ is irrational, it will stay in the sequence. Let
$b_1 = 1, b_2 = x, b_3, \dots, b_m$ be the remaining numbers. As in Section 22.3, each of
the numbers (23.1) is a unique rational linear combination of $b_1, \dots, b_m$.
Let $f$ be the following function on the numbers $b_1, \dots, b_m$:

$f(1) = 1, \quad f(x) = -1, \quad f(b_3) = \dots = f(b_m) = 0$;

extend $f$ to rational linear combinations of the numbers $b_1, \dots, b_m$ by linearity:

$f(r_1 b_1 + \dots + r_m b_m) = r_1 f(b_1) + \dots + r_m f(b_m)$.

Thus, if $u$ and $v$ are rational linear combinations of the numbers $b_1, \dots, b_m$, then

(23.2)    $f(u + v) = f(u) + f(v)$,

that is, the function $f$ is additive.

³ And this similarity is the reason for a lecture on tiling problems to be included in a chapter
devoted to polyhedra.

Consider a rectangle with side lengths $u$ and $v$, both rational linear combinations
of the numbers $b_1, \dots, b_m$. Define the "area" of this rectangle as $f(u)f(v)$. If
two such rectangles share either a horizontal or a vertical side, they can merge
together to form a bigger rectangle. Due to the additivity of the function $f$, equation
(23.2), the "area" of the bigger rectangle is the sum of the "areas" of the two smaller
ones.

It follows that the "area" of the $x \times 1$ rectangle is the sum of the "areas" of the
squares that tile it. The former is $f(x)f(1) = -1$, while the "area" of a $u \times u$ square
is $f(u)^2$, a non-negative number. This is a contradiction. □
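
The bookkeeping of this proof can be sketched in a few lines. In the sketch below a rational linear combination of $b_1 = 1$ and $b_2 = x$ is stored as a pair of rational coefficients; this representation, the restriction to two basis elements, and the names `f`, `add`, `area`, `ONE`, `X` are our illustrative assumptions, not part of the original argument.

```python
from fractions import Fraction

# A number r1 * 1 + r2 * x is stored as the coefficient pair (r1, r2).
ONE = (Fraction(1), Fraction(0))
X = (Fraction(0), Fraction(1))

def f(u):
    """Additive function with f(1) = 1 and f(x) = -1, extended by linearity."""
    return u[0] - u[1]

def add(u, v):
    """Sum of two rational linear combinations."""
    return tuple(a + b for a, b in zip(u, v))

def area(u, v):
    """The "area" f(u) f(v) of a u-by-v rectangle."""
    return f(u) * f(v)

# Two rectangles sharing a side of length w merge into a (u + v)-by-w rectangle.
u = (Fraction(2, 3), Fraction(1, 5))
v = (Fraction(1, 3), Fraction(4, 5))
w = (Fraction(1, 2), Fraction(3, 7))
```

With these definitions `area(X, ONE)` is negative while `area(u, u)` is a square of a rational number, which is exactly the clash the proof exploits.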

23.7 Tilings by squares and electrical circuits. Consider a tiling of a
rectangle by squares, such as in Figure 23.21. Let $x_1, \dots, x_9$ be the side lengths of
the squares. For each segment in this figure, horizontal or vertical, we have a linear
relation between the variables $x_i$: these relations express the length of a segment
as the sum of the sides of the squares adjacent to this segment on its two sides.
For the tiling in Figure 23.21, these equations are:

(23.3)    $x_2 = x_4 + x_5, \quad x_3 + x_5 = x_6, \quad x_1 + x_4 = x_7 + x_8, \quad x_6 + x_8 = x_9$

(horizontal segments), and

(23.4)    $x_1 = x_2 + x_4, \quad x_7 = x_8 + x_9, \quad x_4 + x_8 = x_5 + x_6$

(vertical segments). For a tiling to exist, this system of linear equations should
have a solution in positive numbers. The tiling in Figure 23.21 corresponds to the
following solution:

$x_1 = 15,\ x_2 = 8,\ x_3 = 9,\ x_4 = 7,\ x_5 = 1,\ x_6 = 10,\ x_7 = 18,\ x_8 = 4,\ x_9 = 14$;

of course, one can multiply these numbers by any factor. 

Equations (23.3) and (23.4) can be interpreted as Kirchhoff's laws for electrical
circuits. An example of a circuit is shown in Figure 23.22. We assume that all the
resistors are unit, and that the currents are given by the numbers $x_i$. There are two
Kirchhoff laws: the vertex equations state that the flow of current into every vertex
equals the flow out of it, and the mesh equations state that the voltage drop around
any closed path is zero. Since the resistors are unit, by Ohm's law, the voltage drop
on the $i$-th resistor equals $x_i$, the current. The vertex equations for the circuit in
Figure 23.22 are precisely the equations (23.3), and the mesh equations are the
equations (23.4).

The circuit in Figure 23.22 is constructed from the tiling in Figure 23.21 as 
follows: to every horizontal segment there corresponds a vertex in the electric 
circuit, and each square in the tiling corresponds to a resistor. A resistor connects 
two vertices if the respective square is adjacent to the two corresponding horizontal 
lines. This construction works for any tiling of a rectangle by squares and provides 
an electrical circuit. 
Figure 23.21. Tiling by squares

Figure 23.22. The circuit corresponding to the tiling in Figure 23.21

A choice of a voltage drop between the upper and lower vertices uniquely 
determines the currents in all resistors, and we obtain a solution of the system 
(23.3)-(23.4). In particular, the system (23.3)-(23.4) has a unique solution, up to a 
common factor. The same conclusion holds for any tiling of a rectangle by squares. 
The downside of this method is that we have no control on the signs of the currents: 
some of them may be zero or negative, and then the circuit will not correspond to 
a tiling by squares. 
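
To see how a voltage drop determines the currents, one can solve the vertex equations directly. The sketch below is written under stated assumptions: the figure is not reproduced here, so the edge list is our reconstruction from the vertex equations (23.3) (each equation is one inner vertex, and the unpaired ends of resistors go to the two terminals); the labels `s`, `t`, `A`, `B`, `C`, `D` and the function name `solve_currents` are hypothetical.

```python
from fractions import Fraction

# Resistor name -> (tail, head); the current is counted from tail to head.
# "s" and "t" are the upper and lower terminals, "A".."D" the inner vertices.
EDGES = {
    "x7": ("s", "C"), "x9": ("s", "D"),
    "x8": ("D", "C"), "x6": ("D", "B"),
    "x4": ("C", "A"), "x1": ("C", "t"),
    "x5": ("B", "A"), "x3": ("B", "t"),
    "x2": ("A", "t"),
}

def solve_currents(edges, source, sink, drop):
    """Currents in unit resistors when `source` is held `drop` above `sink`."""
    nodes = sorted({n for pair in edges.values() for n in pair})
    inner = [n for n in nodes if n not in (source, sink)]
    idx = {n: i for i, n in enumerate(inner)}
    fixed = {source: Fraction(drop), sink: Fraction(0)}
    m = len(inner)
    # Vertex equation at an inner node p: sum over neighbours q of (V_p - V_q) = 0.
    a = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for u, v in edges.values():
        for p, q in ((u, v), (v, u)):
            if p in idx:
                a[idx[p]][idx[p]] += 1
                if q in idx:
                    a[idx[p]][idx[q]] -= 1
                else:
                    a[idx[p]][m] += fixed[q]
    # Gauss-Jordan elimination over the rationals.
    for col in range(m):
        piv = next(r for r in range(col, m) if a[r][col] != 0)
        a[col], a[piv] = a[piv], a[col]
        pivot = a[col][col]
        a[col] = [x / pivot for x in a[col]]
        for r in range(m):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    volt = dict(fixed)
    for n in inner:
        volt[n] = a[idx[n]][m]
    # Ohm's law with unit resistors: current = voltage difference.
    return {name: volt[u] - volt[v] for name, (u, v) in edges.items()}

currents = solve_currents(EDGES, "s", "t", 33)
```

With a voltage drop of 33 the computed currents coincide with the side lengths listed above; any other drop rescales them by a common factor. For a circuit that does not come from a tiling, some of the returned values may be zero or negative.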

23.8 Tilings by rectangles with an integer side. 

Theorem 23.4. A rectangle R is tiled by rectangles each of which has an integer 
side. Then R has an integer side. 
This tiling theorem has a record number of different proofs (14 given in [87],
and more are known). We choose one of the most elegant.

Proof. The integral $\int \sin 2\pi x \, dx$ over an interval of integer length is zero. It
follows that the double integral

$\iint \sin 2\pi x \, \sin 2\pi y \; dx \, dy$

over each tile is zero. Hence this double integral, evaluated over $R$, vanishes as
well. Assume that the lower left corner of $R$ is the origin and its sides have lengths
$a$ and $b$. Then

$0 = \int_0^b \int_0^a \sin 2\pi x \, \sin 2\pi y \; dx \, dy = \frac{1}{(2\pi)^2} (1 - \cos 2\pi a)(1 - \cos 2\pi b)$.

It follows that either $\cos 2\pi a = 1$ or $\cos 2\pi b = 1$, that is, either $a$ or $b$ is an integer.
□
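
The integral used in the proof has a closed form over any axis-parallel rectangle, so the argument is easy to try out numerically. A minimal sketch (the function name `tile_integral` and the sample tiling are our own choices):

```python
import math

def tile_integral(x0, x1, y0, y1):
    """Integral of sin(2 pi x) sin(2 pi y) over the rectangle [x0, x1] x [y0, y1]."""
    ix = (math.cos(2 * math.pi * x0) - math.cos(2 * math.pi * x1)) / (2 * math.pi)
    iy = (math.cos(2 * math.pi * y0) - math.cos(2 * math.pi * y1)) / (2 * math.pi)
    return ix * iy

# A 3 x 0.7 rectangle tiled by three tiles, each with an integer side.
tiles = [(0, 1, 0, 0.7), (1, 3, 0, 0.4), (1, 3, 0.4, 0.7)]
total = sum(tile_integral(*t) for t in tiles)

# A 1.5 x 0.7 rectangle has no integer side, and its integral is not zero.
no_integer_side = tile_integral(0, 1.5, 0, 0.7)
```

Up to rounding errors, `total` vanishes, as it must for any rectangle assembled from such tiles, while `no_integer_side` is clearly positive.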

Theorem 23.4 has an interesting consequence. Suppose that an $m \times n$ rectangle
is tiled by $p \times q$ rectangles (the numbers $m$, $n$, $p$ and $q$ are integers). Of course, this
implies that $pq$ divides $mn$. We can say more:

Corollary 23.5. The number $p$ divides either $m$ or $n$, and so does $q$.

Proof. Rescale by the factor $1/p$: now an $(m/p) \times (n/p)$ rectangle is tiled by
$1 \times (q/p)$ rectangles. By Theorem 23.4, either $m/p$ or $n/p$ is an integer, that is, $p$
divides either $m$ or $n$. Similarly for $q$. □
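
The corollary gives a quick test that is strictly stronger than the area condition. The helper below is a sketch with a hypothetical name, `passes_corollary`; it checks necessary conditions only and does not decide whether a tiling actually exists.

```python
def passes_corollary(m, n, p, q):
    """Necessary condition of Corollary 23.5 for tiling an m x n rectangle by p x q tiles."""
    p_ok = m % p == 0 or n % p == 0
    q_ok = m % q == 0 or n % q == 0
    return p_ok and q_ok

def passes_area_test(m, n, p, q):
    """The weaker condition that pq divides mn."""
    return (m * n) % (p * q) == 0
```

For example, a 6 x 6 square and 1 x 4 tiles pass the area test, since 4 divides 36, but fail the corollary, since 4 divides neither side.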



John Smith Martyn Green Henry Williams 

January 23, 2010 August 2, 1936 June 6, 1944 

23.9 Tilings by triangles of equal areas, briefly mentioned. In conclu- 
sion, we cannot help mentioning one more, extremely intriguing, "impossible tiling" 
result: one cannot tile a square by an odd number of triangles of equal areas (for 
any even number of tiles, see Figure 23.23). This theorem is relatively new (1970) 
and has a very surprising proof. Perhaps, even more surprisingly, there are quadri- 
laterals that cannot be tiled by any number of triangles of equal areas. See chapter 
5 of [75] for an exposition. 

23.10 Exercises. 

23.1. Can one tile the polygon in Figure 23.24 by dominos? 

23.2. Delete one black and one white square from the chess board. Prove that 
the truncated board can be tiled by dominos. 

Hint. Consider a closed path that covers all the squares of the chess board
and place the dominos along it.

Figure 23.23. Tiling by triangles of equal areas

Figure 23.24. Variation on the coloring argument

23.3. Show that a 10 x 10 square cannot be tiled by 1 x 4 rectangles. 
Hint. Use 4-coloring. 
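
One way to read the hint, offered here as a sketch rather than as the intended solution: color the cell in row i and column j with the color (i + j) mod 4. Every 1 x 4 tile then covers one cell of each color, so a tiling can exist only if all four colors occur equally often. The function name `color_counts` is ours.

```python
from collections import Counter

def color_counts(rows, cols, k):
    """Number of cells of each color (i + j) mod k on a rows x cols board."""
    return Counter((i + j) % k for i in range(rows) for j in range(cols))

counts = color_counts(10, 10, 4)
```

On the 10 x 10 board the four colors are not equally represented, which rules out a tiling.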

23.4. ** Prove Theorem 23.2. 

23.5. Prove that the polygon in Figure 23.25 cannot be tiled by squares (it 
clearly can if we allow anti-tiles!). 



Figure 23.25. This region cannot be tiled by squares (side lengths marked in the figure: $\sqrt{2} - 1$ and $\sqrt{2}$)



23.6. Let x = 2 - Tile a square by three rectangles similar to the $1 \times x$
rectangle.

Comment. A square can be tiled by rectangles similar to the $1 \times x$ rectangle if
and only if $x$ is a root of a polynomial with integer coefficients and, for a polynomial
of least degree satisfied by $x$, every root $a + ib$ satisfies $a > 0$, see [30].

23.7. Show that Theorem 23.3 still holds true if one is allowed to use tiles made 
of "anti-matter". 

Hint. Redefine the "area" of a $u \times v$ rectangle as $u f(v) - v f(u)$. This area is
again additive and it vanishes for all squares.

23.8. Give a coloring proof of Theorem 23.4 considering an infinite chess board 
with (1/2) x (1/2) squares. 

Comment. This is the same as to replace the function $\sin 2\pi x \sin 2\pi y$ by the
function $(-1)^{[2x]} (-1)^{[2y]}$.




LECTURE 24 

Rigidity of Polyhedra 

24.1 Cauchy Theorem. A cardboard model of a convex polyhedron $P$ is cut
along its edges into a number of polygons, the faces of $P$. One has a complete list
of adjacencies: which faces $F_i$ and $F_j$ share an edge $E_k$. Following this list, one
assembles a polyhedron $P'$ by pasting the faces along the same edges as in $P$. Is
$P'$ necessarily congruent to $P$?




Figure 24.1. These polyhedra are combinatorially the same and 
have congruent faces 

The answer depends on whether P' is a convex polyhedron. Without the con- 
vexity assumption, the polyhedron is not uniquely determined, see Figure 24.1. 
However, for convex polyhedra, one has the following Cauchy theorem (1813). 

Theorem 24.1. If the corresponding faces of two convex polyhedra are congruent
and adjacent in the same way then the polyhedra are congruent as well.

In the plane, a similar statement clearly fails: every polygon, except a triangle,
admits deformations so that the lengths of the edges remain the same but the
angles change, see Figure 24.2.




Figure 24.2. Plane polygons are flexible 

A consequence of Theorem 24.1 is the Cauchy rigidity theorem: a convex poly- 
hedron cannot be deformed. A more precise formulation is as follows. 

Corollary 24.2. If a convex polyhedron is continuously deformed so that all 
its faces remain congruent to themselves then the polyhedron also remains congruent 
to itself. 

This result is in stark contrast with the constructions of flexible (non-convex!) 
polyhedra described in Lecture 25. 

A continuous version of Cauchy's theorem, due to Cohn-Vossen, states that 
smooth closed convex surfaces (ovaloids) are rigid: an isometric deformation is a 
rigid motion. It is not known whether smooth non-convex closed surfaces admit 
non-trivial isometric deformations. 

24.2 Proof of Cauchy's theorem. The proof is based on two lemmas. The 
first is combinatorial (or, one may say, topological). 

Suppose that some of the edges adjacent to a vertex of a convex polyhedron are 
marked + or — (and some edges are not marked at all) . Let us make a full circuit 
around the vertex keeping track of the signs of the edges. We say that a sign change 
occurs if a positive edge follows a negative one, or a negative edge follows a positive 
one; the unmarked edges are ignored. For example, there are 4 sign changes in 
Figure 24.3. 
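
The counting rule is simple enough to state as code. In this sketch the edges around a vertex are listed in cyclic order as '+', '-' or None for an unmarked edge; the encoding and the function name `sign_changes` are our own.

```python
def sign_changes(labels):
    """Number of sign changes in a full circuit around a vertex.

    `labels` lists the edges in cyclic order; unmarked edges (None) are ignored.
    """
    signs = [s for s in labels if s in ("+", "-")]
    # signs[i - 1] with i = 0 is the last element, which closes the circuit.
    return sum(1 for i in range(len(signs)) if signs[i] != signs[i - 1])
```

Because the circuit is closed, the result is always even.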




Figure 24.3. Four sign changes 
Lemma 24.1. Assume that some edges of a convex polyhedron are labeled + or
−. Let us mark the vertices adjacent to at least one labeled edge. Then there exists
a marked vertex such that, going around this vertex, one encounters at most two
sign changes.

The second lemma is geometrical. Consider two convex spherical (or plane)
$n$-gons $P_1$ and $P_2$ whose corresponding sides have equal lengths. Mark by + or −
the vertices of $P_1$ according to whether the corresponding angle of $P_1$ is greater or
smaller than that of $P_2$; if the angles are equal, the vertex is not marked.

Lemma 24.2. If there are marked vertices at all, that is, the polygons are not
congruent, then, going around polygon $P_1$, one encounters at least four sign changes.

An explanation is in order. In spherical geometry, the role of straight lines is 
played by great circles. The shortest segment between two points is the smaller 
arc of the great circle through these points. With this convention, the definition of 
convexity is the same as in the plane. 

Probably Lemma 24.2 is historically the first in a long series of geometrical the- 
orems involving the number four (four vertex theorems); quite a few are discussed 
in Lecture 10. 

The rest of this lecture is devoted to proofs of Lemmas 24.1 and 24.2. But first 
we deduce Cauchy's theorem from them. 

Proof of Cauchy's theorem. Assume that the faces of two convex polyhedra
$S_1$ and $S_2$ are congruent and adjacent in the same way. If the polyhedra are not
congruent then some of their corresponding dihedral angles are not equal. Label
the edges of $S_1$ by + or − sign according to whether the corresponding dihedral
angle of $S_1$ is greater or smaller than that of $S_2$; if the angles are equal, the edge is
not labeled.

By Lemma 24.1, there is a vertex $V_1$ of the polyhedron $S_1$, adjacent to some
labeled edges and with no more than two sign changes around it. Let $V_2$ be the
corresponding vertex of $S_2$. Consider the unit spheres centered at $V_1$ and $V_2$. The
faces of the polyhedra $S_1$ and $S_2$, adjacent to $V_1$ and $V_2$, intersect the spheres along
convex spherical polygons, $P_1$ and $P_2$. The lengths of the sides of these polygons
equal the angles of the respective faces of the polyhedra $S_1$ and $S_2$. Therefore $P_1$
and $P_2$ have equal corresponding edges.

The vertices of the spherical polygons $P_1$ and $P_2$ are the intersections of the
respective edges of $S_1$ and $S_2$ with the spheres, and the angles of $P_1$ and $P_2$ are
equal to the respective dihedral angles of $S_1$ and $S_2$. Hence the marking of the
vertices of the polygon $P_1$, as described in Lemma 24.2, coincides with the labels of
the edges of the polyhedron $S_1$ according to the dihedral angles. In particular, the
number of sign changes around $P_1$ is not greater than two. But by Lemma 24.2,
this number is at least four, a contradiction. □



24.3 Euler's theorem and the proof of Lemma 24.1. The classical Euler
theorem relates the number of vertices $v$, edges $e$ and faces $f$ of a convex polyhedron:
$v - e + f = 2$. For example, a dodecahedron has 20 vertices, 30 edges and 12 faces:
$20 - 30 + 12 = 2$.
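
Euler's relation is easy to check on small examples. In the sketch below a polyhedron is described by its faces, each given as a cyclic list of vertex labels; this encoding, the function name `euler_count` and the labeling of the three sample polyhedra are our own choices.

```python
def euler_count(faces):
    """Return (v, e, f, v - e + f) for a polyhedron given by its faces.

    Each face is a tuple of vertex labels in cyclic order.
    """
    vertices = {v for face in faces for v in face}
    # Consecutive vertices of a face (cyclically) span an edge; a frozenset
    # makes the edge independent of direction, so shared edges count once.
    edges = {frozenset((face[i - 1], face[i]))
             for face in faces for i in range(len(face))}
    v, e, f = len(vertices), len(edges), len(faces)
    return v, e, f, v - e + f

tetrahedron = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
cube = [(0, 1, 2, 3), (4, 5, 6, 7), (0, 1, 5, 4),
        (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7)]
octahedron = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1),
              (5, 1, 2), (5, 2, 3), (5, 3, 4), (5, 4, 1)]
```

All three examples return 2 in the last position.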
We need a slightly more general result concerning graphs on the sphere (the central
projection of a convex polyhedron on a sphere whose center lies inside the polyhedron
yields such a graph). Denote by $v$, $e$, $f$ the number of nodes, edges and faces
and by $c$ the number of components of the graph.

Theorem 24.3. One has:

(24.1)    $v - e + f = c + 1$.

Proof. The argument goes by induction on the number of edges.

Assume that the graph has a vertex $V$ of valence 1, that is, a vertex adjacent to
exactly one edge, say, $E$. Delete $V$ and $E$ (but do not delete the other end point of
$E$). Then the numbers $v$ and $e$ decrease by 1. Since the edge $E$ does not separate
faces, the number $f$ remains intact, and so does $c$, the number of components of
the graph. Therefore the number $v - e + f - c$ does not change.

Next assume that all vertices have valences 2 or higher. Then there exists a
closed non self-intersecting path in the graph. Indeed, choose a vertex, say, $V_1$.
There is an edge going from this vertex. Go to the other end-point of this edge, $V_2$.
The valence of $V_2$ is not less than 2, so there is another edge going from $V_2$. Let $V_3$
be the other end-point of this edge, etc. We continue until we return, for the first
time, to an already visited vertex. This yields a closed non self-intersecting path.

This path separates the sphere into two components (see Lecture 26 for a
discussion of the Jordan Theorem). Delete one edge from this path (but do not
delete its end points). Then the number $f$ decreases by 1, and so does $e$, while $v$
and $c$ remain the same. Again $v - e + f - c$ does not change.

Continue in this way until all edges are deleted. Then the graph consists of
$v$ isolated vertices, has $f = 1$ face and $c = v$ components, and the relation (24.1)
holds. □

Now we proceed to the proof of Lemma 24.1. The labeled edges of a convex
polyhedron form a graph which we think of as drawn on the sphere. Let $v$, $e$, $f$ and
$c$ have the same meaning as before, and let $s$ be the sum of the numbers of sign
changes over all vertices of the graph. The number of sign changes around a vertex
is even. Hence Lemma 24.1 will follow if we show that the average number of sign
changes per vertex is less than 4, that is, $s < 4v$. Following Cauchy, one has a
stronger estimate.

Proposition 24.3. One has:

(24.2)    $s \le 4v - 8$.

Proof. Instead of going around the vertices of the graph let us traverse the
boundaries of all the faces. The total number of sign changes $s$ will be the same:
indeed, two edges are neighbors when going around a vertex if and only if they are
neighbors when traversing the boundary of a face, see Figure 24.4.

When traversing the boundaries, we make the following conventions:

a) if the boundary of a face has many components, we traverse them all and add
the number of sign changes;

b) if a segment is adjacent to the face on both sides, we treat this segment as
two-sided, both sides carrying the same sign; such a segment will be traversed twice,
once on each side.
Figure 24.4. Counting the number of sign changes in two ways




Figure 24.5. There are 8 sign changes on one of the boundary 
components and 2 on the other 

This is illustrated in Figure 24.5: the total number of sign changes, contributed 
by the quadrilateral face, is 8. 

Denote by $f_i$ the number of faces whose boundary consists of $i$ edges. Here
$i \ge 3$, and an edge is counted twice if it is adjacent to the face on both sides. For
example, the boundary of the face in Figure 24.5 has 13 edges. Thus

(24.3)    $f = f_3 + f_4 + f_5 + \dots$.

When one traverses the boundary of a domain with $i$ edges, one encounters at most
$i$ sign changes, and if $i$ is odd, at most $i - 1$. Therefore

(24.4)    $s \le 2f_3 + 4f_4 + 4f_5 + 6f_6 + 6f_7 + \dots$.

Each edge either belongs to the boundary of two faces or is counted twice in the
boundary of one face, hence

(24.5)    $2e = 3f_3 + 4f_4 + 5f_5 + \dots$.

It follows from Euler's formula (24.1) that $v - e + f \ge 2$, or $4v - 8 \ge 4e - 4f$.
Substitute $f$ and $e$ from (24.3) and (24.5):

$4v - 8 \ge (6f_3 + 8f_4 + 10f_5 + \dots) - (4f_3 + 4f_4 + 4f_5 + \dots) = 2f_3 + 4f_4 + 6f_5 + 8f_6 + \dots$,

and the right hand side is not less than that of (24.4). This completes the proof. □
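
The final comparison is a term-by-term statement about the coefficients of $f_i$: in $4e - 4f$ the coefficient is $2i - 4$, and in the bound (24.4) it is the largest even number not exceeding $i$. A short sketch of this check (the two function names are ours):

```python
def euler_coefficient(i):
    """Coefficient of f_i in 4e - 4f, obtained from (24.3) and (24.5)."""
    return 2 * i - 4

def sign_change_bound(i):
    """Coefficient of f_i in (24.4): at most i sign changes, and an even number."""
    return 2 * (i // 2)

# Compare the two coefficients for faces with 3, 4, ..., 199 edges.
dominates = all(euler_coefficient(i) >= sign_change_bound(i) for i in range(3, 200))
```

The two coefficients agree for $i = 3$ and $i = 4$, and from $i = 5$ on the first is strictly larger.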

24.4 Arm Lemma and proof of Lemma 24.2. The following statement is
known as the Cauchy Arm Lemma.¹

Lemma 24.4. Let $P_1 \dots P_n$ and $P'_1 \dots P'_n$ be two convex spherical or plane
polygons. Assume that $|P_i P_{i+1}| = |P'_i P'_{i+1}|$ for $i = 1, 2, \dots, n - 1$ and
$\angle P_i P_{i+1} P_{i+2} \le \angle P'_i P'_{i+1} P'_{i+2}$ for $i = 1, \dots, n - 2$. Then $|P_1 P_n| \le |P'_1 P'_n|$
with equality only if all the corresponding angles are equal, see Figure 24.6.

Figure 24.6. The Cauchy Arm Lemma

One may view $P_1 \dots P_n$ as a robot's arm: when the arm opens, the distance
between the "shoulder" and the "tips of the fingers" increases. This fact is intuitively
quite clear; it is interesting that Cauchy's proof contained an error that went
undetected for about 100 years. The proof below is due to I. Schoenberg.
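
The "robot arm" picture can be tried out numerically in the plane. The sketch below places the arm vertex by vertex from the side lengths and the interior angles; the function name `arm_span` and the sample arms are our own, and the code does not itself check the convexity assumption of the lemma (the three samples used here are convex).

```python
import math

def arm_span(lengths, angles_deg):
    """Distance between the two ends of a planar arm.

    lengths[k] is the length of the k-th segment, and angles_deg[k] is the
    interior angle (in degrees) at the joint after the k-th segment.
    """
    x = y = 0.0
    heading = 0.0
    for k, length in enumerate(lengths):
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        if k < len(angles_deg):
            # Turn left by the exterior angle at the next joint.
            heading += math.pi - math.radians(angles_deg[k])
    return math.hypot(x, y)

closed = arm_span([1, 1, 1], [90, 90])     # three sides of a unit square
middle = arm_span([1, 1, 1], [90, 120])    # one joint opened
opened = arm_span([1, 1, 1], [120, 120])   # three sides of a regular hexagon
```

Opening the joints from 90 to 120 degrees moves the ends of the arm from distance 1 to distance 2, with the intermediate arm in between.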

Proof of the Arm Lemma. Induction on $n$. When $n = 3$, the result is obvious:
if two triangles have two pairs of congruent corresponding sides then the third side
opposite the greater angle is greater; see Figure 24.7.

Figure 24.7. The side, opposite the greater angle, is greater

Let $n \ge 4$. If the two polygons have equal angles, say, at vertices $P_i$ and $P'_i$,
then one may cut off these vertices by the diagonals $P_{i-1}P_{i+1}$ and $P'_{i-1}P'_{i+1}$. Since
these diagonals are equal, we are reduced to the same statement but with $n$ one
less.

Thus we assume that each angle of the first polygon is smaller than the
respective angle of the second one. Let us start increasing the angle $P_{n-2}P_{n-1}P_n$
by rotating the side $P_{n-1}P_n$ about the vertex $P_{n-1}$, keeping the polygon
convex, until one of two things happens: either $\angle P_{n-2}P_{n-1}P_n$ becomes equal to
$\angle P'_{n-2}P'_{n-1}P'_n$ or we reach the situation when the vertices $P_1$, $P_2$ and $P_n$ lie on one
line, see Figure 24.8. We obtain a new polygon $P_1 \dots P_n$, and in both cases, the
side $P_1P_n$ has increased; see the first paragraph of the proof applied to the triangle
$P_1P_{n-1}P_n$.

Figure 24.8. Inductive proof of the Cauchy Arm Lemma

In the first case, we obtain two $n$-gons satisfying the conditions of the Arm
Lemma having a pair of equal corresponding angles. This case was already dealt
with in the second paragraph of the proof.

In the second case, we ignore the first vertices in both polygons and apply
the induction assumption to the polygons $P_2 \dots P_n$ and $P'_2 \dots P'_n$ to conclude that
$|P'_2P'_n| \ge |P_2P_n|$. Then

$|P'_1P'_n| \ge |P'_2P'_n| - |P'_1P'_2| \ge |P_2P_n| - |P_1P_2| = |P_1P_n|$,

where the first inequality is the triangle inequality. This concludes the proof. □

¹ This lemma was stated and proved, in different terms, by Legendre in 1794. Legendre also
conjectured that convex polyhedra were rigid.

It remains to prove Lemma 24.2. This is not hard, given the Arm Lemma. 

Proof of Lemma 24.2. The number of sign changes being even, assume first
that there are two sign changes. Then we can number the vertices of the polygon
$P_1$ consecutively so that the first $k$ vertices $A_1, \dots, A_k$ are all positive or unmarked
and the remaining $n - k$ vertices $A_{k+1}, \dots, A_n$ are all negative or unmarked. Let
$B_1, \dots, B_n$ be the respective vertices of $P_2$.

Choose points $C$ and $D$ on the sides $A_kA_{k+1}$ and $A_nA_1$, and let $E$ and $F$ be
points on the sides $B_kB_{k+1}$ and $B_nB_1$ so that $|A_kC| = |B_kE|$ and $|A_nD| = |B_nF|$,
see Figure 24.9.

Apply the Arm Lemma to polygons $DA_1 \dots A_kC$ and $FB_1 \dots B_kE$ to conclude
that $|CD| > |FE|$. Similarly, applied to polygons $CA_{k+1} \dots A_nD$ and $EB_{k+1} \dots B_nF$,
the Arm Lemma yields $|CD| < |FE|$, a contradiction.

Figure 24.9. Proof of Lemma 24.2

Finally, if there are no sign changes at all, let us assume that all signs are
positive. Then, by the Arm Lemma, $|A_1A_n| > |B_1B_n|$, again a contradiction. □






24.5 Exercises. 

24.1. Prove that every polyhedron has two faces with the same number of sides.

24.2. Prove that every convex polyhedron has either a triangular face or a
vertex incident to three faces (or both).

24.3. * Prove the following continuous analog of Lemma 24.2: given two plane
ovals, let $ds$ and $ds_1$ be the arc length elements at points with parallel and equally
oriented exterior normals. Then the ratio $ds_1/ds$ has at least four extrema.

24.4. * Let $P$ and $P'$ be plane convex $n$-gons, $n \ge 4$, whose sides have lengths
$\ell_1, \dots, \ell_n$ and $\ell'_1, \dots, \ell'_n$. Assume that the corresponding sides of the polygons are
parallel to each other. Consider the cyclic sequence



a, = 



ti+i 



£' 



/ \ &i — 1 

Prove that either <n = for all % or ai > for at least four values of i. 
24.5. * Prove a continuous analog of the Arm Lemma: given two smooth convex
arcs $\gamma_1(s)$ and $\gamma_2(s)$ of equal lengths and parameterized by arc length, if their
curvatures satisfy the inequality $k_1(s) \ge k_2(s)$ for all $s$ then the chord subtended
by $\gamma_2$ is not less than that subtended by $\gamma_1$.




LECTURE 25 

Flexible Polyhedra 

25.1 Introduction. This lecture is closely related to the previous one (Lec- 
ture 24), but they can be read independently, in particular, in either order. Again, 
we consider polyhedra made of rigid (say, metallic) faces attached to each other 
along edges of equal lengths by hinges which do not obstruct changing angles
between faces. Except in several clearly specified cases, polyhedra are assumed
"complete", which means that every edge belongs to precisely two faces. Our problem
is: is it possible to bend the polyhedron without deforming its faces (Figure 25.1)?
The readers of the previous lecture are familiar with the following theorem. 




FIGURE 25.1. Can a polyhedron be flexible? 

Theorem 25.1 (Cauchy, 1813). Any convex polyhedron is rigid (cannot be 
bent). 

Here we shall prove the following, quite unexpected result. 
Theorem 25.2 (Connelly, 1978). There exists a (non-convex) flexible polyhedron.

Unlike Cauchy's theorem, the theorem of Connelly can be proved by providing 
one single example of a flexible polyhedron. We shall construct such a polyhedron 
quite explicitly, and if you have appropriate materials (rigid cardboard and tape), 
you will be able to make a model of such polyhedron and to feel its flexibility with 
your own fingers. 

25.2 Bricard's octahedron. One can ask why it took so long (more than 
150 years) after the Cauchy theorem to find the Connelly example. Of course, 
questions like that can never be answered with certainty, but we can try to guess. 
The rigidity problem was well known and respected among geometers, but almost 
all of them believed and tried to prove that the answer is positive: all polyhedra, 
convex or not, are rigid. (By the way, the efforts of these geometers were not totally 
fruitless: the rigidity of polyhedra was established under conditions much milder 
than convexity.) Connelly, on the other hand, had the courage to doubt. And then 
he noticed that almost all necessary mathematical work was done in the 1890s by a 
French mathematician and architect Raoul Bricard. Bricard was able to construct 
a flexible polyhedron which, however, not only fails to be convex, but also has a 
self-intersection. So, Connelly looked for a tool to make the Bricard polyhedron 
free of self-intersections, and he found such tools - again in Bricard's construction. 

Bricard's polyhedron is an octahedron, in the sense that it consists of eight 
triangular faces attached to each other precisely in the same way as the faces of 
Plato's regular octahedron. To construct the Bricard octahedron, we need two 
simple observations. The first is that a pyramid without bottom (this is an
"incomplete" polyhedron) is flexible if and only if the number of its (triangular) faces
is more than 3, see Figure 25.2.¹ Note that the (missing) bottom of this pyramid
is not supposed to be flat (the points A, B, C, D are not assumed to belong to one
plane for either of the 4-gonal pyramids of Figure 25.2).




Figure 25.2. Rigid and flexible pyramids

The second observation is the following lemma.

Lemma 25.1. Let ABCD be a non-planar spatial quadrilateral such that AB =
CD and BC = AD. Let E, F be the midpoints of the "diagonals" AC and BD. Then
EF ⊥ AC and EF ⊥ BD.

¹ This observation was also made in Lecture 20.
Figure 25.3. Proof of Lemma 25.1

Proof. (Figure 25.3). Draw the segments AF and CF. Since △ABD = △CBD
(their sides are equal), ∠ADB = ∠CBD. Hence, △ADF = △CBF (they have
two pairs of equal sides forming equal angles). Thus, AF = CF, hence △ACF
is isosceles and its median FE is also its altitude. So, EF ⊥ AC, and we can
establish, in the same way, that EF ⊥ BD. □
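
The lemma can also be checked numerically without presupposing any symmetry. In the sketch below the points A, B, C are chosen freely, and D is then placed on the circle of points satisfying |CD| = |AB| and |AD| = |BC|; the helper names, the sample coordinates and the angle parameter `phi` are our own assumptions.

```python
import math

def sub(p, q): return tuple(a - b for a, b in zip(p, q))
def add(p, q): return tuple(a + b for a, b in zip(p, q))
def scale(p, k): return tuple(k * a for a in p)
def dot(p, q): return sum(a * b for a, b in zip(p, q))
def norm(p): return math.sqrt(dot(p, p))
def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def fourth_vertex(A, B, C, phi):
    """A point D with |CD| = |AB| and |AD| = |BC|.

    Such points form a circle around the line AC; phi selects one of them.
    B must not lie on the line AC.
    """
    d = norm(sub(C, A))
    u = scale(sub(C, A), 1 / d)                 # unit vector along AC
    r1, r2 = norm(sub(C, B)), norm(sub(B, A))   # r1 = |BC|, r2 = |AB|
    t = (d * d + r1 * r1 - r2 * r2) / (2 * d)   # distance from A to the circle's plane
    h = math.sqrt(r1 * r1 - t * t)              # radius of the circle
    w = sub(B, A)
    e1 = sub(w, scale(u, dot(w, u)))            # part of AB perpendicular to AC
    e1 = scale(e1, 1 / norm(e1))
    e2 = cross(u, e1)
    centre = add(A, scale(u, t))
    return add(centre, add(scale(e1, h * math.cos(phi)), scale(e2, h * math.sin(phi))))

A, B, C = (0.0, 0.0, 0.0), (2.0, 0.0, 1.0), (3.0, 2.0, 0.0)
D = fourth_vertex(A, B, C, 1.0)
E = scale(add(A, C), 0.5)   # midpoint of AC
F = scale(add(B, D), 0.5)   # midpoint of BD
EF = sub(F, E)
```

Up to rounding, `dot(EF, sub(C, A))` and `dot(EF, sub(D, B))` both vanish. The choice `phi = math.pi` gives a parallelogram, for which E and F coincide and the statement becomes trivial.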

Lemma 25.1 can be reformulated in the following way: a spatial quadrilateral 
with equal opposite sides is symmetric with respect to the line joining the midpoints 
of its diagonals. In this form, it can be regarded as a spatial version of the well- 
known theorem stating that the diagonals of a parallelogram bisect each other. 

Now we are prepared for Bricard's construction. Take a spatial quadrilateral
ABCD with equal opposite sides, AB = CD, BC = AD. Lemma 25.1 provides for
this quadrilateral an axis of symmetry; denote it by ℓ. Take two points, M and
N, different from each other and from each of A, B, C, D, and also symmetric with
respect to ℓ. (To visualize the construction better, you may choose the points M
and N sufficiently far away from the quadrilateral ABCD.) Bricard's octahedron is
the union of 8 triangles: ABM, BCM, CDM, DAM, ABN, BCN, CDN, DAN (see
Figure 25.4). Some faces intersect each other: in Figure 25.4, EF is the intersection
line of the faces ABN and CDM, the faces ABN and BCM meet at BE, and the
faces CDM and ADN meet at FD.

Theorem 25.3 (Bricard, 1897). The Bricard octahedron is flexible. 

Proof. We shall consider the Bricard octahedron as the union of two 4-gonal 
pyramids: ABCDM and ABCDN. According to the first observation above, the 
(bottomless) pyramid ABCDM is flexible. Its deformation retains the relations 
AB = CD and BC = AD, thus the "base" ABCD of the pyramid has a line of 
symmetry at every moment of the deformation. If we reflect the varying pyramid 
ABCDM in this line, we shall get a deformation of the pyramid ABCDN, and 
together these two deformations form a deformation of the Bricard octahedron. □ 

25.3 Geometry of Bricard's octahedron. Of the geometric observations 
we are going to make in this section only the last one will be needed later. Still the 
fascinating properties of the Bricard octahedron deserve a detailed consideration. 

Figure 25.4. Bricard's octahedron

First, the Bricard octahedron has axial symmetry: the midpoints of the
"diagonals" AC, BD, and MN lie on one line, and the whole octahedron is symmetric
in this line. This gives the simplest construction of the Bricard octahedron: take
a line in space, take three pairs of points, symmetric with respect to this line (no
four of them should lie in one plane), denote these pairs of points by A and C, B
and D, and M and N, and you are done.
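
This recipe translates directly into coordinates. The sketch below takes the z-axis as the line of symmetry and picks A, B, M arbitrarily; the specific coordinates and the names `reflect_in_z_axis` and `FACES` are our own, and the code does not verify that no four of the points are coplanar.

```python
import math

def reflect_in_z_axis(p):
    """Symmetry in the z-axis, that is, rotation by 180 degrees about it."""
    x, y, z = p
    return (-x, -y, z)

# Three freely chosen points and their mirror images.
A = (1.0, 0.0, 0.0)
B = (0.5, 1.2, 0.7)
M = (2.0, 1.0, 3.0)
C, D, N = (reflect_in_z_axis(p) for p in (A, B, M))

# The eight triangular faces of the octahedron.
FACES = [(A, B, M), (B, C, M), (C, D, M), (D, A, M),
         (A, B, N), (B, C, N), (C, D, N), (D, A, N)]

# Pairs of edges that the symmetry maps to each other.
EQUAL_EDGES = [((A, B), (C, D)), ((B, C), (A, D)), ((A, M), (C, N)),
               ((A, N), (C, M)), ((B, M), (D, N)), ((B, N), (D, M))]
edge_gaps = [abs(math.dist(*e1) - math.dist(*e2)) for e1, e2 in EQUAL_EDGES]
```

All entries of `edge_gaps` are zero up to rounding: the symmetry swaps A with C, B with D and M with N, so the opposite sides of the quadrilateral ABCD are equal, and so are the other listed pairs of edges.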

Since the Bricard octahedron is always self-intersecting, you cannot make a
good model of it. But it is possible to make a model including 6 of the 8 faces of
the octahedron. (Notice that the octahedron in Figure 25.4 will be free of
self-intersections if you remove the faces ABN and CDM.) To create your model, you
can use a 6-triangle development like the one shown in Figure 25.5.




Figure 25.5. A development of the Bricard octahedron less two faces 
You need to cut a polygon ADMCDANB out of thin and rigid cardboard,
then fold it along the segments AM, MB, BC, CN, and ND in such a way that the
two segments AD come together (A to A and D to D); attach them to each other
by tape. The whole "strip" should be twisted twice (twice as many times as we
twist a strip to make a Möbius band). You can make a development of your own,
but it is important that

(*)    AB = CD, BC = AD, AM = CN, AN = CM, BM = DN, BN = DM;

in particular, the two pentagons ADMCB and CBNAD are identical (with their
decompositions into triples of triangles), but oppositely oriented. The figure
obtained will be bounded by two triangles, NAB and MCD, which will be linked (as
they are linked in Figure 25.4).

You will be surprised how flexible your model is (with the triangular faces 
remaining rigid!). It can be deformed to look as shown in Figure 25.6, left, or 
Figure 25.6, right. 




Figure 25.6. What the flexible model can look like 



It is interesting that to be flexible, Bricard's octahedron needs to be symmetric. 
If you slightly distort the sizes of the triangles in Figure 25.5 in such a way that 
the equalities (*) fail to hold (still the two segments AD should be equal), you can 
make a model indistinguishable by sight from the previous model, but it will be 
rigid, and you will be able to feel the difference with your fingers. (How slightly 
the sizes should be varied, depends on the quality of your materials, the cardboard 
and the tape.) 

Another way to make a non-self-intersecting incomplete Bricard's octahedron
is to remove two faces which share an edge, say, AMB and ANB. You can cut the 
6-triangle development which is shown in Figure 25.7, left, and then attach to each 
other the two segments AD above the plane CMDN and the two segments BC 
below this plane. (Actually, this figure is too symmetric, all we need is the equalities 
(*); but it is more convenient to deal with an excessively symmetric octahedron.) 
You will get a flexible polyhedral surface with a 4-gonal edge (AMBN) as shown 
in Figure 25.7, right; note that the distance AB does not change in the process of 
deformation. 

You cannot add to this model either of the missing faces AMB or ANB, 
because of self-intersections. But still you can make your polyhedron complete, 

Figure 25.7. Another model of Bricard's octahedron

although infinite. Namely, replace the face AMB by the complement of it in the 
half-plane bounded by AB (see Figure 25.8) and do the same with the face ANB. 







Figure 25.8. A replacement for a face 

You will get a complete flexible polyhedron (resembling an open book) with 6 
finite faces and 2 infinite faces. It is shown in Figure 25.9, left, and its side view, 
important for the next section, is shown in Figure 25.9, right. (The meaning of the 
arrow in this figure will be also explained in the next section.) 




Figure 25.9. Bricard's octahedron as an open book 



25.4 Connelly's construction. We start with a very degenerate Bricard oc- 
tahedron. Take a (planar) rectangle ABCD (AB < CD) and a point M inside 
this rectangle such that MA = MB < MC = MD. Break the rectangle into 4
triangles: AMB, BMC, CMD, DMA. Take another copy of this rectangle and a 
point N symmetric to M with respect to the center of the rectangle. Then break 

Figure 25.10. Starting point of the construction: a degenerate
Bricard octahedron 

the second rectangle into the triangles ANB, BNC, CND, DNA. After this, put 
the first copy of the rectangle onto the second one as shown in Figure 25.10. 

The 8 triangles AMB, . . . , DNA, although they all lie in one plane, form a 
Bricard octahedron, which is flexible within the class of self-intersecting polyhedra. 
(Figure 25.6, right, may serve as a right picture of this deformation.) 

We can make the self-intersections less dramatic if we erect pyramids on some 
of the faces of this octahedron. It should be noticed that we shall not destroy the 
flexibility of a polyhedron if we replace some faces by pyramids based on these faces 
(see Figure 25.11); the pyramids will stay rigid during the deformation.




Figure 25.11. Adjoining pyramid to faces 

Add pyramids to the faces of the flat octahedron of Figure 25.10 (we need, 
actually, only 6 pyramids, since the small triangles AMB and CND will not touch 
any other faces). We get a flexible polyhedron with 20 faces and with a very mild 
self-intersection: there will be two pairs of crossing edges. (The two halves of this 
polyhedron are shown in Figure 25.12, and the crossing points of the edges are
marked as E and F in this figure.) 

The dihedral angles at the crossing points have no other common point, they 
only touch each other as shown in Figure 25.13, left. But in the process of deforma- 
tion these pairs of touching dihedral angles may behave as shown in Figure 25.13, 
right: they either go apart, or penetrate each other. 

Actually, of the two pairs of touching dihedral angles, one behaves in one of the 
ways, and the other one behaves in the other way (this is not important for us, but 
can be easily confirmed by a calculation, or even by an experiment). Anyhow, this 
polyhedron still cannot be deformed without self-intersections. What to do? What 

Figure 25.12. Two halves of an almost ready Connelly's polyhedron




Figure 25.13. Deformation of touching dihedral angles 



we want is to remove a small neighborhood of the touching point in (at least) one 
of the two touching dihedral angles (Figure 25.14, left). And this must be done in 
a "polyhedral" way. But we know how to do it: Figure 25.9, right, shows it! The 
arrow points at a small cavity, and all we need is to locate this cavity at and around 
the point E (see also Figure 25.14, right). This completes Connelly's construction 
and the proof of Theorem 25.2. 

It should be noted that the construction shown above is very good for proving 
Theorem 25.2 but not convenient for modeling and demonstration. The constructed 

Figure 25.14. A final modification of a dihedral angle

polyhedron consists of 26 faces, of which 24 are triangles and 2 are non-convex 
hexagons. Even if you manage to make a model of this polyhedron, it will admit 
a very small deformation (compared with the size of the polyhedron), and you will
never know whether this deformability is ensured by the mathematical properties
of the polyhedron or rather by flaws of the material used (the cardboard and the 
tape). However, there are modifications of Connelly's construction which yield 
quite a satisfactory model. We shall discuss these modifications in the next section. 

25.5 Better constructions. After the breathtaking discovery of Connelly, 
many geometers tried to improve his construction. One possibility for an improve- 
ment is obvious: the insert in Figure 25.14 may be made bigger, so that it would 
eat up completely the two faces forming the dihedral angle. This will reduce the 
number of faces to 24, and all of them will be triangular. Then one can observe 
that not all the pyramids are really needed, and this gives rise, eventually, to a 
model with only 18 faces, all of which are triangular. 







Figure 25.15. Cut this out of cardboard and make a model of 
Steffen's polyhedron 

It was a young German mathematician, Klaus Steffen, who found what is probably the
best possible construction. His polyhedron consists of 14 triangular faces and has
only 9 vertices. You can make a model using the development in Figure 25.15.
A drawing of this polyhedron can be seen in Figure 25.16 (to make the drawing
easier to understand, note that the vertex G is located in a cavity surrounded
by the ridge BDKH). 2







Figure 25.16. The Steffen polyhedron

We notice in conclusion that, as is seen from Figure 25.15, Steffen's polyhedron 
contains two identical 6-face pieces of Bricard's octahedron similar to those shown 
in Figure 25.8 (the two wings of Figure 25.15), and two more triangular faces (the 
middle part of Figure 25.15) which are attached to both octahedral pieces. These 
three parts of the Steffen polyhedron are shown separately in Figure 25.17. 

These remarks make the flexibility of Steffen's polyhedron less surprising. We 
already know that the two octahedral pieces are deformable, and, certainly, we can 
deform them when they are attached to each other; thus the Steffen polyhedron 
less two faces (ABC and ABD) is deformable. It is not hard to see that this
deformation does not affect the distance CD, which makes the whole
polyhedron deformable (and the dihedral angle formed by these two faces stays rigid in the
process of deformation).

25.6 The bellows conjecture. There is one more natural problem: does 
the volume inside a flexible polyhedron vary in the process of deformation? There 
are indications that it does not. It is not hard to prove that the modification of 
a dihedral angle shown in Figure 25.14, right, does not affect the volume of the
polyhedron. 3 From this, it is easy to deduce that the volume inside the Connelly 
polyhedron is equal to the sum of the volumes of 6 pyramids attached at the first 
step of the construction and does not vary in the process of the deformation. Sim- 
ilarly, the volume of the Steffen polyhedron (Figure 25.16) is equal to that of the 

2 An animated picture of Steffen's polyhedron showing its deformation is available on the 
web, e.g., at www.mathematik.com/Steffen/.

3 With a right definition of the volume "inside" a self-intersecting polyhedron (we leave the 
details to the reader), one can prove that the volume inside any Bricard octahedron is zero. 

Figure 25.17. The Steffen polyhedron disassembled



tetrahedron ABCD and is also constant. Still there remained a possibility of a 
construction of a flexible polyhedron with a variable volume. However, in 1995, 
I. Sabitov proved that such a construction is not possible; thus, Sabitov proved a 
statement which was called "the bellows conjecture." Actually, Sabitov's theorem
states that, given a set of (rigid) faces of a polyhedron, the volume of the
polyhedron can assume only countably many values [66]. This, certainly, excludes any
variation of the volume. 

Let us mention, in conclusion, a somewhat paradoxical construction. Start with 
a tetrahedron and deform it, as shown in Figure 25.18. Namely, one subdivides each 
edge of the tetrahedron into two segments and each face into 10 triangles, and then 
pushes the new vertices on the edges inwards. This is an isometric deformation of 
the original tetrahedron. 

The middle parts of the edges are pushed inwards, so one may expect the 
volume to decrease. However, the middle parts of the faces move outwards, and the 
total volume actually increases, by more than a third! Similar volume increasing 

Figure 25.18. Isometric volume-increasing deformation of a tetrahedron

isometric deformations can be made for all the Platonic solids [7, 12] and even all 
polyhedral surfaces [58]. 

It may appear that this construction contradicts the bellows conjecture. This 
is not the case: the deformation of a tetrahedron depicted in Figure 25.18 is not a 
continuous isometric deformation (the sizes of the small triangles, subdividing the 
faces of the original tetrahedron, change in the process). 



25.7 Exercises. 

25.1. Let ABCD be a non-self-intersecting quadrilateral in the plane. We are 
allowed to deform it in such a way that the lengths of the sides stay constant but 
the angles may vary. 

(a) Prove that we can always deform the quadrilateral ABCD into a triangle. 

(b) Prove that we can always deform the quadrilateral ABCD into a trapezoid. 

(c) Is it always possible to deform the quadrilateral ABCD into a trapezoid 
with AB being one of two parallel sides? 

25.2. Let ABCD be a trapezoid. We want to deform it as in Problem 25.1 in 
such a way that in the process of deformation it remains a trapezoid. Prove that 
it is possible if and only if it is a parallelogram. 




25.3. Let $A_1A_2\ldots A_n$ be an n-gon in the plane. An admissible deformation of
this n-gon consists in a continuous motion of points $A_1, A_2, \ldots, A_n$ not affecting
the distances $|A_1A_2|, \ldots, |A_{n-1}A_n|, |A_nA_1|$.

(a) Let n = 4. In terms of the four numbers $|A_1A_2|, |A_2A_3|, |A_3A_4|, |A_4A_1|$,
find a necessary and sufficient condition for the existence of an admissible
deformation joining the quadrilateral $A_1A_2A_3A_4$ with its mirror image $A'_1A'_2A'_3A'_4$ (the
latter means that there exists a line $l$ such that $A'_i$ is symmetric to $A_i$ for all i).

Hint. See the comment below.

(b) Do the same for n = 5 and the numbers $|A_1A_2|, \ldots, |A_4A_5|, |A_5A_1|$.

Hint. See the comment below.

Comment. There is a general result due to Kapovich and Millson [43] stating
that for any n-gon $A_1A_2\ldots A_n$ the following two statements are equivalent.

(1) For any n-gon $A'_1A'_2\ldots A'_n$ with $|A'_1A'_2| = |A_1A_2|, \ldots, |A'_{n-1}A'_n| = |A_{n-1}A_n|$,
$|A'_nA'_1| = |A_nA_1|$, there exists an admissible deformation of the n-gon
$A_1A_2\ldots A_n$ into $A'_1A'_2\ldots A'_n$.

(2) Let $a_1, a_2, \ldots, a_n$ be the numbers $|A_1A_2|, \ldots, |A_{n-1}A_n|, |A_nA_1|$ arranged
in the non-increasing order: $a_1 \ge a_2 \ge \cdots \ge a_n$. Then

$$a_2 + a_3 \le a_1 + a_4 + \cdots + a_n.$$
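
Condition (2) of the comment is easy to evaluate mechanically. The sketch below is a hypothetical helper, not part of the original text; it assumes that the inequality is read as non-strict, that is, the sum of the second and third longest sides is at most the sum of all the other sides.

```python
def condition_two_holds(side_lengths):
    """Evaluate condition (2): a2 + a3 <= a1 + a4 + ... + an.

    The side lengths may be given in any order; they are sorted
    into non-increasing order first.
    """
    a = sorted(side_lengths, reverse=True)
    if len(a) < 3:
        raise ValueError("an n-gon needs at least three sides")
    return a[1] + a[2] <= a[0] + sum(a[3:])

print(condition_two_holds([1, 1, 1, 1]))   # a rhombus-like linkage
print(condition_two_holds([3, 3, 3, 1]))   # three long sides, one short side
```

For a non-degenerate triangle the test fails, as it should: a triangle is rigid and cannot be deformed into its mirror image.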

25.4. Let ABCD be a non-self-intersecting quadrilateral in the plane, M be
a point not in this plane. The pyramid MABCD becomes flexible if one removes 
the base ABCD (see Section 25.2). Prove that in the process of deformation the 
points A, B, C, D cannot remain coplanar. 

25.5. A (possibly self-intersecting) polyhedron is called two-sided if one can 
paint the two sides of each face black and white in such a way that the colors 
match at each edge. Otherwise, a polyhedron is called one-sided. 

(a) Prove that a non-self-intersecting polyhedron is two-sided. 

(b) Take a regular (Plato's) octahedron as shown in Figure 25.19, remove the 
triangular faces AMB, BNC, CMD, DNA and add the square faces ABCD, AMCN, BMDN. 
Prove that this construction creates a complete (every edge belongs to precisely 2
faces) one-sided polyhedron.






Figure 25.19. One-sided polyhedron made of an octahedron 

(c) Prove that the Bricard octahedron is two-sided. Prove also that the
reflection in the axis of symmetry takes white into black and black into white.

25.6. Let P be a complete two-sided polyhedron. Choose a black and white
coloring of faces as in Exercise 25.5 and choose a plane Π not perpendicular to any
face of P and not crossing P. For a face F define its underlying volume as the
volume of the prism between F and its orthogonal projection onto Π (see Figure
25.20) multiplied by −1 if the upper face of the prism is white. Define the signed
volume of P as the sum of underlying volumes of all faces.




Figure 25.20. Signed volume 



(a) Prove that the signed volume does not depend on Π.

(b) Prove that if P is non-self-intersecting, and the exterior of P is painted white,
then the signed volume is the usual volume.

(c) Prove that the (signed) volume of the Bricard octahedron is zero. 
Hint. Use the symmetry property from Exercise 25.5 (c). 

In particular, this shows that the bellows conjecture holds for the Bricard oc- 
tahedra. 

(d) Prove that the signed volume of the Steffen polyhedron (Section 25.5) is 
equal to that of the tetrahedron ABCD. Deduce the bellows conjecture for this 
polyhedron. 
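
For readers who want to experiment with Exercise 25.6 numerically, here is a minimal Python sketch. It does not use the prisms over a plane Π from the exercise; instead it swaps in the standard formula that sums signed volumes of tetrahedra spanned by the origin and each consistently oriented triangular face, which for a closed two-sided surface gives the same number up to the overall sign convention. The coordinates are an arbitrary sample of a Bricard octahedron whose symmetry axis is assumed to be the z-axis; all names are hypothetical.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def signed_volume(vertices, faces):
    """Sum of signed volumes of the tetrahedra (origin, face) over all faces.

    Each face is a triple of vertex indices; the faces must be oriented
    consistently (every edge is traversed in opposite directions by its two faces).
    """
    return sum(det3(vertices[i], vertices[j], vertices[k]) for i, j, k in faces) / 6.0

def half_turn(p):
    """Half-turn about the z-axis (assumed axis of symmetry)."""
    return (-p[0], -p[1], p[2])

A, B, M = (1.0, 0.2, 0.3), (0.1, 1.2, -0.4), (0.5, -0.3, 1.0)
C, D, N = half_turn(A), half_turn(B), half_turn(M)
vertices = [A, B, C, D, M, N]            # indices 0..5
faces = [(4, 0, 1), (4, 1, 2), (4, 2, 3), (4, 3, 0),   # MAB, MBC, MCD, MDA
         (5, 1, 0), (5, 2, 1), (5, 3, 2), (5, 0, 3)]   # NBA, NCB, NDC, NAD

print(signed_volume(vertices, faces))
```

The half-turn carries each face to another face with the opposite orientation, so the eight terms cancel in pairs; this is the symmetry argument suggested by the hint to part (c).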

25.7. (a)* Consider a smooth family of convex polyhedra $P_t$, where t is a
parameter, and denote by $l_j(t)$ its edge lengths and by $\varphi_j(t)$ the respective dihedral
angles. Prove that

$$\sum_j l_j(t)\,\frac{d\varphi_j(t)}{dt} = 0.$$

(b) The total mean curvature of a polyhedron is defined as $\sum_j l_j\varphi_j$. Prove that
the total mean curvature of a flexible polyhedron remains the same in the process
of deformation.

LECTURE 26

Alexander's Horned Sphere 

Two properties of the "horned sphere" make it worthy of describing in this 
book. First, it delivers a solution of an important and difficult problem. Second, it 
is really beautiful. 

26.1 Theorems of C. Jordan and A. Schoenflies. A curve in the plane 
is a trace of a moving point. If the starting point of the movement coincides with 
the terminal point, the curve is called closed; if the positions of the moving point
do not coincide at any two distinct moments of time, the curve is called simple, or 
non-self-intersecting. Jordan's theorem states that a simple closed curve C divides 
the plane into two domains, "interior" and "exterior," in the sense that any two 
points from the same domain may be joined by a polygonal line disjoint from the 
curve C, while any polygonal line joining points from different domains crosses the 
curve (Figure 26.1). 




Figure 26.1. A closed curve divides the plane into two parts 

This theorem is important in analysis, say, in integration theory where we 
need to consider domains bounded by a given simple closed curve; but it leaves 
unanswered a question belonging rather to topology than to analysis: what do the
parts into which the plane is divided by a simple closed curve look like?

In mathematics this frivolous expression "look like" is usually replaced by a
more rigorous word, "homeomorphic": two domains are homeomorphic if there
exists a bijective map of one onto the other such that both this map and its inverse
are continuous. For example, the interiors of a circle and a square are homeomorphic
(although one can say that they look different), while an annulus (the domain
between two concentric circles) is not homeomorphic to either of them.




Figure 26.2. Interior and exterior 



In 1908, A. Schoenflies proved that, whichever simple closed curve in the plane
one takes, its interior and exterior will be homeomorphic to those of a usual circle: 
to an open disc and a plane with a round hole (Figure 26.2). Similarly, one can 
prove that a domain between two disjoint simple closed curves, of which one is 
contained in the interior of the other one, is homeomorphic to an annulus. 

26.2 Spatial generalizations. One could expect that there should be theorems
in space geometry similar to those of Jordan and Schoenflies: all you need
is to find the right statements. Closed curves, certainly, must be replaced by closed
surfaces. Here, however, we encounter our first difficulty: while closed curves all 
look the same (homeomorphic), closed surfaces may be essentially different: there 
are spheres, tori, spheres with handles, etc. (Figure 26.3). We shall resolve this
difficulty by brute force: we shall simply ignore all this diversity, restricting our 
attention to surfaces obtained by a continuous deformation, without self-crossings, 
of the usual round sphere. For such surfaces we can hope to establish results similar 
to theorems of Jordan and Schoenflies. 

As to Jordan's theorem, its spatial analog turns out to be true: the surface 
divides space into two parts, interior and exterior, and the statement of the planar 
Jordan theorem is true for them without any changes. The same is true for spheres 
with handles, and there are natural generalizations to arbitrary dimensions. 

But what about the Schoenflies theorem? Its spatial counterpart should state 
that the interior and the exterior domains are homeomorphic to those of the
usual sphere, that is, to an open ball and the complement to a closed ball. It was
an American topologist, James Alexander, then very young, who proved in 1924 that
this conjecture, however plausible it looked, was actually wrong. Alexander's work 
was very convincing: he presented an explicit construction of a deformed sphere in 

Figure 26.3. Different kinds of surfaces

space which divides space into non-standard parts. The details of his construction 
are described below. 

26.3 It is very beautiful. And very simple. The main ingredient of the 
construction is shown in Figure 26.4: we take two disjoint small discs inside a
bigger planar disc, pull out of them two "fingers" in such a way that their ends 
come close to each other but do not touch each other. 




Figure 26.4. Pulling out fingers

The ends themselves remain planar discs. We shall usually perform this pulling
of fingers simultaneously from two parallel discs, and the four fingers will form a "lock"
as shown in Figure 26.5. 




Figure 26.5. The lock 

Now we can describe the whole construction. Take a round sphere. Then we
pull out from two discs on this sphere two fingers, almost touching each other, as 
in Figure 26.4 - see Figure 26.6. 




Figure 26.6. The first step: two fingers are pulled out of a sphere 

The ends of the fingers are two parallel and close to each other planar discs; 
from these two discs we pull out four fingers, two from each disc, and lock them as 
in Figure 26.5 - see Figure 26.7. 



Figure 26.7. The second step: a lock is inserted between the fingers

Now we have two pairs of still smaller and still closer discs, and insert small 
copies of Figure 26.5 between the discs of each pair. And so on, infinitely many 
times. What we obtain after this "so on," is what is called "Alexander's horned 
sphere." It is hardly possible to make a satisfactory drawing of this sphere
(because the fingers involved become smaller and smaller); still Figure 26.7 provides a
reasonable visual approximation. 

26.4 It is an honest sphere. At first glance, this does raise some
doubts.

Alexander's sphere looks like a weightlifter's weight with a slot sawed out of the
handle and a complicated combination of pieces of wire of different sizes inserted
in the slot. Is it really true that the ends of the pieces of the handles do not meet? 
Seemingly, they could infinitely approach, and in the limit merge together, since on 
each step of the construction we pull together some parts of the surface. 

No, this danger is purely imaginary. We can perform the previous construction 
in such a way that for every two different points of the sphere, the distance between 
their final positions is not less than, say, 1% of their initial distance. (We say "say,"
since all the sizes, such as the lengths of the fingers and the width of the slots, are not
specified in Figures 26.4 and 26.5, and we can choose them as we wish.)




Figure 26.8. Discs involved in the construction 

Examine our constructions, step by step. The first pulling of fingers involves 
two discs on a round sphere. The rest of the sphere remains untouched during the 
whole construction. The second step involves four smaller discs, two inside each of 
the previous discs (see Figure 26.8). The part of the sphere outside these four discs 
remains untouched by all steps of the construction after the first step. Similarly, 
there are 8 discs involved in the third step (see again Figure 26.8), 16 discs involved 
in the 4th step (not shown in Figure 26.8), and so on. We shall refer to these discs
as the discs of sizes 1, 2, 3, 4, ...; thus, there are 2^n discs of size n, and each disc of
size n contains precisely two discs of size n + 1. Points of the sphere not contained 
in any disc of size n remain untouched by all steps of our construction starting with 
the n-th step. 

Let us now examine the behavior of the distances between the points. Take 
two different points, A and B, on the sphere. If neither A nor B belongs to either
of the two discs of size 1, then these two points remain unchanged and so does the
distance between them. If one of them is contained in one of the discs of size 1, and
the other one is not, then the distance between them is not changed significantly;
we can assume that even if it decreases, it decreases not more than thrice. If
A and B belong to different discs of size 1, then after the first step they become 
significantly closer, but we can assume that the distance between them decreases 
not more than 10 times. If, in addition to that, the two points belong to discs of 
size 2, then the next step makes them much closer, say not more than 10 times 
closer, to each other (see Figure 26.9). However, even if these points belong to 
discs of size 2 or more, they will not become significantly closer after all subsequent 

Figure 26.9. Points A and B do not approach each other



steps of our construction (see again Figure 26.9). This is because of the following 
fundamental property of the construction: on the n-th step of the construction, we 
pull together only points from the same disc of size n − 1.

Now we can formulate the general Principle of Distances. Let n be the greatest 
number such that A and B belong to the same disc of size n. Then the distance 
between them remains unchanged under steps 1 through n. If neither of them 
belongs to a disc of size n + 1, then the distance between them remains unchanged 
also under all the subsequent steps of the construction. If only one of the points 
A, B belongs to a disc of size n + 1, then after all the subsequent steps the distance 
between them decreases no more than thrice. If they both belong to discs of size
n + 1, then under the (n + 1)-st step the distance between them decreases no more
than 10 times. If neither of the points belongs to a disc of size n + 2, then the distance
between them remains unchanged after the (n + 1)-st step. If precisely one of these
points belongs to a disc of size n + 2, then the distance between them may decrease
thrice on the (n + 2)-nd step and is not changed significantly after this. Finally,
if both A and B belong to discs of size n + 2, then the distance between them
decreases at most 10 times on the (n + 2)-nd step and is not changed significantly
after this. In all cases, the distance between A and B decreases not more than 100
times. 

Thus, Alexander's horned sphere is indeed a sphere, that is, it is "homeomor- 
phic" to the sphere as we stated above. 

26.5 The exterior of the horned sphere. The interior of the horned sphere 
is homeomorphic to the usual ball (without the boundary); it is not hard to prove 
this, but we shall not need it. What is more important is that the exterior of the 
horned sphere is not the same as (not homeomorphic to) that of the usual sphere. 
The proof of this is simple but interesting, since it provides a sample of a topological 
proof. The exterior of the usual sphere (and the interior as well) possesses a property 
which topologists call simple connectedness: every closed curve can be continuously
deformed to one point. It looks obvious (see Figure 26.10), although a rigorous proof 
of it involves some technicalities. 




Figure 26.10. The exterior of the usual sphere is simply connected 



Homeomorphic domains are simply connected simultaneously: if one is simply 
connected, then the other one should be also simply connected. However, the 
exterior of the horned sphere is not simply connected: a curve enclosing the handle 
of the weight (Figure 26.11) cannot be continuously pulled out of the handle. (To 
pull it out, we shall have to carry it between a pair of close parallel discs of any 
size; hence, in the process of deformation, the curve becomes arbitrarily close to the 
horned sphere, which means that it will touch the sphere at some moment, which 
is prohibited: our deformation should be performed in the exterior of the sphere.) 

Thus, the exterior of the horned sphere is not simply connected and, hence, not 
homeomorphic to the exterior of the usual sphere. This shows that the conjectured 
spatial version of the Schoenflies theorem is false. 

26.6 What else? Now, it is easy to be smart. We could pull the horns not 
outside but inside the sphere; then we get a sphere for which the interior, not the 
exterior, is not homeomorphic to the interior of the standard round sphere. Or we 
could pull two pairs of horns, one inside and one outside the sphere; then both the 







exterior and the interior of the sphere will be different from those for the usual 
sphere (not simply connected). Or we can pull not two, but, say, twenty two (or 
two hundred twenty two) pairs of horns, some inside and some outside the sphere 
and tangle them with each other in any way. This variety of possibilities does not 
surprise us any longer. 



26.7 Conclusion: further developments. Years and decades passed after 
Alexander's discovery. Still topologists hoped that a spatial version of the Schoenflies
theorem might exist: one only needs to exclude shapes that are too complicated. What
if we take polyhedral spheres, that is, spheres made out of finitely many polygonal
pieces of a plane? Even in this case the problem turned out to be very hard. Still,
in 1960, Morton Brown proved the polyhedral version of the Schoenflies theorem
(actually, Brown's result holds for a wider class of surfaces, see Exercise 26.1). 

Brown's theorem holds also in higher dimensions. However, in higher dimen- 
sions even polyhedral spheres sometimes provide surprises. For example,
R. Kirby and L. Siebenmann showed in the 1970's that two polyhedral spheres in 
four-dimensional space, one of which is contained in the interior of the other, may 
cobound a domain different from the standard domain cobounded by two concentric 
round spheres. 

All this, however, goes beyond our technical possibilities. 

26.8 Exercises. A surface S is called locally flat at a point P ∈ S, if there
exists a homeomorphism of a small ball B centered at P onto another ball, B',
which maps the intersection B ∩ S onto the intersection of B' with a plane. Brown's
theorem (see Section 26.7) states that if a surface in space is homeomorphic to a 
sphere and is locally flat at all its points, then it cuts space into two standard parts. 

26.1. At which points is Alexander's sphere not locally flat (describe this set 
on the sphere both before and after the deformation)? Is this set countable or 
uncountable? 

Further exercises are not related directly to Alexander's sphere, but they concern
constructions which have a similar flavor. We begin with the classical Cantor
set. Take the interval [0, 1]. Remove the middle third, $\left(\frac13, \frac23\right)$. Then remove the
middle third from each of the two remaining intervals, that is, remove $\left(\frac19, \frac29\right)$ and
$\left(\frac79, \frac89\right)$. Then remove the middle third of each of the 4 remaining intervals, and
so on. The set obtained as a result of this infinite process is called the Cantor set.
We denote it by C, and denote its complement by D.
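
The removal process can be followed step by step on a computer. The sketch below is only an illustration with hypothetical function names; it uses exact rational arithmetic and lists the closed intervals that remain after a given number of steps.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Remove the open middle third from every closed interval (a, b)."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))      # left third is kept
        result.append((b - third, b))      # right third is kept
    return result

def cantor_intervals(steps):
    """Closed intervals remaining after the given number of removal steps."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(steps):
        intervals = cantor_step(intervals)
    return intervals

for a, b in cantor_intervals(2):
    print(a, b)
```

After n steps there are 2^n intervals, each of length 3^(-n), so the total length that remains is (2/3)^n and tends to zero.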

1 10 19 

26.2. (a) Prove that -, — , — e C. 

(b) More generally, let $x = [0.d_1d_2d_3\ldots]_3$ be a presentation of x in the
numerical system with the base 3; express the condition that x belongs to C in terms of
the digits $d_i$.

Next, we shall define the Cantor function $\gamma\colon [0,1] \to [0,1]$. For $x \in \left(\frac13, \frac23\right)$
put $\gamma(x) = \frac12$. On the two intervals deleted at the second step, $\left(\frac19, \frac29\right)$ and $\left(\frac79, \frac89\right)$,
we set our function to be equal, respectively, to $\frac14$ and $\frac34$. On the four intervals
deleted at the third step, we set the function to be equal, respectively, to $\frac18$, $\frac38$, $\frac58$
and $\frac78$. And so on. This process determines our function on D.
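
The values of the Cantor function on D can be computed by following the removal process until the point falls into a deleted interval. The following Python sketch is an illustration only; the function name and the cut-off parameter `max_steps` are hypothetical, and the cut-off is needed because points of C are never deleted.

```python
from fractions import Fraction

def cantor_function_on_D(x, max_steps=60):
    """Value of the Cantor function at a point x of D, or None if x was not
    deleted within max_steps steps (in particular, if x belongs to C)."""
    x = Fraction(x)
    a, b = Fraction(0), Fraction(1)       # remaining interval that contains x
    lo, hi = Fraction(0), Fraction(1)     # range of function values on [a, b]
    for _ in range(max_steps):
        third = (b - a) / 3
        mid = (lo + hi) / 2
        if a + third < x < b - third:     # x lies in the deleted middle third
            return mid
        if x <= a + third:                # continue in the left third
            b, hi = a + third, mid
        else:                             # continue in the right third
            a, lo = b - third, mid
    return None

print(cantor_function_on_D(Fraction(1, 2)))
print(cantor_function_on_D(Fraction(5, 18)))
```

The rule is that the function is constant on each deleted interval and equal to the midpoint of the range of values available at that stage.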

26.3. (a) Prove that for every $x \in C$ there exists a unique $y \in [0,1]$ such that
$\gamma(z) \le y$ for every $z \in [0,x) \cap D$ and $\gamma(z) \ge y$ for every $z \in (x,1] \cap D$.

We put $\gamma(x) = y$. It is easy to see that $\gamma$ is a continuous monotonic function.

(b) Compute 7 ,7 ^ 

(c) More generally, let $x = [0.d_1d_2d_3\ldots]_3$. Find a presentation of $\gamma(x)$ in the
numerical system with the base 2.

(d) Prove that if x is rational, then $\gamma(x)$ is rational.

Now we will define the Peano curve in the square
$[0,1]^2 = \{(x,y) \mid 0 \le x \le 1,\ 0 \le y \le 1\}$. Let $F\colon [0,1] \to [0,1]^2$ be a continuous
curve with F(0) = (0, 0), F(1) = (0, 1). The coordinates of F(t) are denoted by (f(t), g(t)).
We define a curve $\widetilde F\colon [0,1] \to [0,1]^2$ by the formula

$$
\widetilde F(t) =
\begin{cases}
\left(\tfrac12 g(4t),\ \tfrac12 f(4t)\right) & \text{if } 0 \le t \le \tfrac14,\\[4pt]
\left(\tfrac12\bigl(f(4t-1)+1\bigr),\ \tfrac12 g(4t-1)\right) & \text{if } \tfrac14 \le t \le \tfrac12,\\[4pt]
\left(\tfrac12\bigl(f(3-4t)+1\bigr),\ 1-\tfrac12 g(3-4t)\right) & \text{if } \tfrac12 \le t \le \tfrac34,\\[4pt]
\left(\tfrac12 g(4-4t),\ 1-\tfrac12 f(4-4t)\right) & \text{if } \tfrac34 \le t \le 1.
\end{cases}
$$

In words: we compress the curve F at the scale 1 : 2 and then compose the new curve
of 4 copies of the rescaled old curve with the appropriate translations, rotations
and reflections. See Figure 26.12. Starting from the arbitrary curve F as above,
we apply the transformation described infinitely many times, and denote the limit
curve as $P\colon [0,1] \to [0,1]^2$; this is the Peano curve.



Figure 26.12. The Peano curve 
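The construction can be tried out numerically. The sketch below assumes that the four pieces of the transformation read $(\frac12 g(4t), \frac12 f(4t))$, $(\frac12(f(4t-1)+1), \frac12 g(4t-1))$, $(\frac12(f(3-4t)+1), 1-\frac12 g(3-4t))$ and $(\frac12 g(4-4t), 1-\frac12 f(4-4t))$ on the consecutive quarters of $[0, 1]$; the starting curve (the straight segment from $(0,0)$ to $(0,1)$) and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def refine(curve):
    """One application of the transformation F -> F~ (pieces as assumed above)."""
    def new_curve(t):
        t = Fraction(t)
        if t <= Fraction(1, 4):
            f, g = curve(4 * t)
            return (HALF * g, HALF * f)
        if t <= Fraction(1, 2):
            f, g = curve(4 * t - 1)
            return (HALF * (f + 1), HALF * g)
        if t <= Fraction(3, 4):
            f, g = curve(3 - 4 * t)
            return (HALF * (f + 1), 1 - HALF * g)
        f, g = curve(4 - 4 * t)
        return (HALF * g, 1 - HALF * f)
    return new_curve


def initial_curve(t):
    """A sample starting curve: the segment joining (0, 0) with (0, 1)."""
    return (Fraction(0), Fraction(t))


def approximate_peano(n):
    """Apply the transformation n times to the sample starting curve."""
    curve = initial_curve
    for _ in range(n):
        curve = refine(curve)
    return curve


curve = approximate_peano(6)
print(curve(Fraction(1, 2)), curve(Fraction(1, 3)))
```

Each application halves the largest coordinate difference between two curves, so successive approximations differ by at most $2^{-n}$ in each coordinate; this is the contraction behind the existence of the limit.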



26.4. (a) Prove that the limit exists and is continuous.

(b) Prove that $P$ does not depend on the initial curve $F$ (provided that it is
continuous and joins $(0,0)$ with $(0,1)$).

(c) Prove that for every $(x, y) \in [0, 1]^2$ there exists a $t \in [0, 1]$ (maybe, not
unique) such that $P(t) = (x, y)$ (in other words, the Peano curve fills the whole
square).

(d) Find $P(\dots)$, $P(\dots)$.

(e) For $t = [0.d_1 d_2 d_3 \dots]_2$ find (in the numerical system with the base 2) the
coordinates of $P(t)$.

(f) Prove that if $t$ is rational, then the coordinates of $P(t)$ are rational.




LECTURE 27 

Cone Eversion 

27.1 The problem. In the plane with the origin deleted, consider two functions:
$f_0(x,y) = \sqrt{x^2 + y^2}$ and $f_1(x,y) = -\sqrt{x^2 + y^2}$. Their gradients are the
constant radial vector fields, from and to the origin, see Figure 27.1. One can easily
deform the first field to the second so that no vector of any intermediate field
vanishes: just rotate each vector through 180°.




FIGURE 27.1. Two radial fields, from and to the origin 

But can one perform such a deformation in the class of nondegenerate gradient
vector fields? In other words, can one include the functions $f_0(x,y)$ and $f_1(x,y)$
into a one-parameter family of smooth functions $f_t(x,y)$ without critical points in
the punctured plane, continuously depending upon the parameter $t$? This is the
problem that we shall discuss in this lecture.

The problem is less innocent than it might appear at first glance. Simply looking
at the flow-lines of a vector field, it is hard to tell whether this is a gradient field
of some function. For example, the field in Figure 27.2 is not a gradient: it does
non-zero work along circles centered at the origin. (Recall, from physics, that the
work done by a force $F$ along a curve $\gamma$ is the line integral
$\int_\gamma F \cdot ds$. The work done by a conservative force, that is, the gradient
of a potential function, along a closed curve is always zero.)

Figure 27.2. This is not a gradient field



We can formulate the problem more geometrically. Let $S_t$ be the graph of the
function $f_t(x,y)$. The surfaces $S_0$ and $S_1$ are cones, see Figure 27.3. One wants
to deform one cone to the other (in the class of surfaces whose projection on the
punctured plane is one-to-one, that is, the graphs of functions defined in the punctured
plane) in such a way that no intermediate surface $S_t$ has a horizontal tangent plane
at any point. Indeed, the tangent plane is horizontal precisely when the function
has a critical point, that is, when its gradient vanishes.



Figure 27.3. Can the left cone be everted to the right one without 
being horizontal anywhere at any time? 

In the next section we shall construct such a cone eversion. 

27.2 A solution. First, it is more convenient to deal with an annulus, say,
$1 < \sqrt{x^2 + y^2} < 3$, than with the punctured plane. The smooth mapping, given in
polar coordinates by the formula



identifies the annulus and the punctured plane, and a solution of our problem in
one domain yields a solution in the other.

Here is an explicit deformation. The surface $S_t$ is given by the equation

$$z_t(\alpha, r) = g_t(\alpha) + 0.25\,(r-2)\,h_t(\alpha)$$

in cylindrical coordinates $(\alpha, r, z)$; here $0 \le \alpha \le 2\pi$, $1 < r < 3$ and
$t$ varies from 0 to 1. The functions $g_t$ and $h_t$ are as follows:

$$
\begin{aligned}
g_t &= 4t\sin\alpha, & h_t &= (1-2t) + 4t\cos\alpha, & &\text{for } t \in [0, 0.25];\\
g_t &= 2(1-2t)\sin\alpha + (4t-1)\sin 2\alpha, & h_t &= \cos\alpha + (1-2t), & &\text{for } t \in [0.25, 0.5];\\
g_t &= -2(2t-1)\sin\alpha + (3-4t)\sin 2\alpha, & h_t &= \cos\alpha - (2t-1), & &\text{for } t \in [0.5, 0.75];\\
g_t &= -4(1-t)\sin\alpha, & h_t &= -(2t-1) + 4(1-t)\cos\alpha, & &\text{for } t \in [0.75, 1].
\end{aligned}
$$

One could stop here: the formulas are written, and an industrious reader is welcome
to check that the surfaces $S_t$ have no horizontal tangent planes anywhere (that
is, the functions $z_t$ have no critical points). But of course we owe the reader
explanations.
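For a reader who prefers a numerical sanity check to a hand computation, here is a minimal sketch. It assumes the four formulas read $g_t = 4t\sin\alpha$, $h_t = (1-2t)+4t\cos\alpha$ on $[0, 0.25]$; $g_t = 2(1-2t)\sin\alpha + (4t-1)\sin 2\alpha$, $h_t = \cos\alpha + (1-2t)$ on $[0.25, 0.5]$; and the corresponding expressions on the last two intervals. The grid sizes and function names are arbitrary illustrative choices, and a grid search is of course evidence, not a proof.

```python
import math

EPS = 0.25  # the value of the small parameter in the explicit formulas


def derivative_data(t, alpha):
    """Return (dg_t/dalpha, h_t, dh_t/dalpha) at the angle alpha."""
    s, c = math.sin(alpha), math.cos(alpha)
    c2 = math.cos(2 * alpha)
    if t <= 0.25:
        return 4 * t * c, (1 - 2 * t) + 4 * t * c, -4 * t * s
    if t <= 0.5:
        return 2 * (1 - 2 * t) * c + 2 * (4 * t - 1) * c2, c + (1 - 2 * t), -s
    if t <= 0.75:
        return -2 * (2 * t - 1) * c + 2 * (3 - 4 * t) * c2, c - (2 * t - 1), -s
    return -4 * (1 - t) * c, -(2 * t - 1) + 4 * (1 - t) * c, -4 * (1 - t) * s


def smallest_gradient(nt=80, na=360, nr=10):
    """Smallest value of max(|dz/dalpha|, |dz/dr|) found on a grid."""
    best = float("inf")
    for i in range(nt + 1):
        t = i / nt
        for j in range(na):
            alpha = 2 * math.pi * j / na
            dg, h, dh = derivative_data(t, alpha)
            for k in range(nr + 1):
                r = 1 + 2 * k / nr
                dz_dalpha = dg + EPS * (r - 2) * dh   # derivative along the axis direction
                dz_dr = EPS * h                       # slope of the radial segment
                best = min(best, max(abs(dz_dalpha), abs(dz_dr)))
    return best


print(smallest_gradient())
```

A strictly positive result means that on the sampled grid the two partial derivatives never vanish together, which is exactly the absence of critical points of $z_t$.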

Let us explain the genesis of these formulas. Since the original and the terminal
functions are linear in $r$, it is natural to look for the function $z_t$ in the form:

$$z_t(\alpha, r) = g_t(\alpha) + \varepsilon\,(r-2)\,h_t(\alpha),$$

where $g$ and $h$ are periodic functions and $\varepsilon$ is a sufficiently small
parameter (its actual value in our formulas is 0.25). The original cone corresponds to
$g_0(\alpha) = 0$ and $h_0(\alpha) = \text{const} > 0$; the terminal cone, to
$g_1(\alpha) = 0$ and $h_1(\alpha) = \text{const} < 0$.

It might be instructive to think of the surface $S_t$ as a closed rope ladder in
space whose axis is the closed curve

$$z = g_t(\alpha), \quad 0 \le \alpha \le 2\pi, \quad r = 2,$$

and whose rungs are the radial segments

$$z = g_t(\alpha) + \varepsilon\,(r-2)\,h_t(\alpha), \quad \alpha = \text{const}, \quad 1 < r < 3$$

with the slope $\varepsilon h_t(\alpha)$. So, at the beginning, the axis is a horizontal
circle and the slopes of all the rungs are positive. At the end, the axis is again a
horizontal circle, but the slopes of the rungs are all negative.

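What one wants to avoid in the deformation are the instances when the axis
and the rungs are simultaneously horizontal. Thus the functions

$$\frac{dg_t(\alpha)}{d\alpha} + \varepsilon\,(r-2)\,\frac{dh_t(\alpha)}{d\alpha} \quad\text{and}\quad h_t(\alpha)$$

should have no common zeroes. If, for some $t$, the zeroes of

$$\frac{dg_t(\alpha)}{d\alpha} \quad\text{and}\quad h_t(\alpha)$$

are disjoint, then so are the zeroes of

$$\frac{dg_t(\alpha)}{d\alpha} + \varepsilon\,(r-2)\,\frac{dh_t(\alpha)}{d\alpha} \quad\text{and}\quad h_t(\alpha)$$

for a sufficiently small $\varepsilon$.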

The strategy is clear now. First, change the shape of the axis of the rope
ladder (i.e., the graph of $g(\alpha)$) into a non-horizontal curve, after which one can
safely change the slope of the rungs (the sign of $h(\alpha)$) from positive to negative
on its non-horizontal segments.

The graphs of $g_t(\alpha)$ are sketched in Figure 27.4 on the left. The graphs are
drawn in solid or broken lines; the former means that $h_t(\alpha)$ is positive, and the
latter, that it is negative at the corresponding points $\alpha$. The half-way picture,
$t = 1/2$, is symmetric with respect to the time reversal $t \mapsto 1 - t$; from that
point on, one just repeats the process backward. Figure 27.4, right, shows the
corresponding deformation of the gradient vector fields, thus answering the original
question.

Figure 27.4. The deformation: geometry behind the formulas

Figure 27.5 shows level curves of the functions $z_t$ for $t = 1/8, 1/4, 3/8$ and $1/2$,
and Figure 27.6 the whole deformation, from beginning to end.

27.3 Comments. The existence of a deformation, explicitly constructed in
Section 27.2, follows from the h-principle theory, an actively developing chapter
of differential topology. In fact, the existence of such a deformation is the first
application of Gromov's h-principle, discussed in the book [27] (section 4.1);
constructing an explicit deformation is an exercise in this book. Our construction is
based on the article [80]. As far as we know, the problem discussed in this lecture
was posed by M. Krasnosel'skii in his lecture titled "Mathematical divertissement"
in the 1970s.

One of the early precursors of the h-principle theory was the Whitney theorem,
discussed in Lecture 12; a sphere eversion, mentioned in the same lecture, is another
manifestation of this theory (more precisely, the Smale-Hirsch immersion theory).





Figure 27.5. Level curves of the functions $z_t$ for $t = 1/8, 1/4, 3/8$ and $1/2$



Let us mention another famous result, the Nash-Kuiper theorem concerning
differentiable, but not twice differentiable, maps. We shall not formulate the general
theorem but instead mention one of its striking consequences: one can differentiably
and isometrically embed a unit sphere into an arbitrarily small ball (this would be
impossible if the isometric embedding were twice differentiable)!

Figure 27.6. Cone eversion, from beginning to end

27.4 Exercises.

27.1. (a) Construct a function of two variables that has two local maxima and
no local minima or saddle points. Draw level curves and gradient lines of this 
function. 

(b)* Can one take a polynomial for such a function? 

27.2. (a) Construct a function of two variables that has only one local minimum 
and no other critical points and such that this local minimum is not an absolute 
minimum. Draw level curves and gradient lines of this function. 

(b)* Can one take a polynomial for such a function? 

27.3.** Let $f(x, y) = \cos x \cos y$. The critical points of the function $f$ form the
lattice $(\pi k/2, \pi l/2)$ with $k + l$ even. Consider two nondegenerate vector fields in
the complement to the lattice: the gradient of $f(x, y)$ and the gradient of $-f(x, y)$.
Construct a continuous deformation of one field to the other in the class of
nondegenerate gradient vector fields.

Comment. A similar fact holds for any smooth function with isolated local 
maxima, minima and saddle points. 

27.4. Construct a polynomial of two variables whose range is the interval $(0, \infty)$.

Comment: essentially, this problem was given on the Putnam Competition in
1969. Only 1% of the contestants got a score of 8, 9 or 10 for this problem. When
the examination was printed it was believed that such a polynomial did not exist.

LECTURE 28

Billiards in Ellipses and Geodesics on Ellipsoids



28.1 Plane billiards. The billiard system describes the motion of a free point
inside a plane domain: the point moves with a constant speed along a straight
line until it hits the boundary, where it reflects according to the familiar law of
geometrical optics "the angle of incidence equals the angle of reflection". Equally
well, one may imagine rays of light reflecting in the boundary, which is a perfect
mirror. We shall consider billiard tables bounded by smooth convex closed curves
$\gamma$.

The billiard reflection is a mapping that sends the incoming billiard trajectory 
to the outgoing one. We shall denote this "billiard ball map" by T. The map T 
acts on oriented lines that intersect the billiard table; if a line is tangent to the 
boundary, then T leaves it intact. 

One can characterize an oriented line by its two points of intersection with the
boundary curve $\gamma$. The map $T$ sends $xy$ to $yz$, see Figure 28.1.

The reflection law can be interpreted as a solution to an extremal problem.¹
Fix points $x$ and $z$ and let $y$ vary.

Lemma 28.1. The angles made by lines $xy$ and $yz$ with $\gamma$ are equal if and only
if $y$ is an extremal point of the function $|xy| + |yz|$:

$$\frac{\partial\,(|xy| + |yz|)}{\partial y} = 0. \tag{28.1}$$



Figure 28.1. Billiard ball map

¹ As with many other laws of physics. The one describing propagation of light is called
the Fermat principle: light "chooses" the trajectory that takes least time to get from
one point to another.

Proof. Assume first that $y$ is a free point, not confined to $\gamma$. The gradient of
the function $|xy|$ is the unit vector from $x$ to $y$, and the gradient of $|yz|$ is the
unit vector from $z$ to $y$. Indeed, if, say, $xy$ is an elastic string, fixed at one
endpoint $x$, then the other endpoint $y$ will move directly toward $x$ with unit speed.

Point $y$, confined to $\gamma$, is a critical point of the function $|xy| + |yz|$ if and
only if the sum of the two gradients is orthogonal to $\gamma$ (this is the Lagrange
multipliers principle, familiar from calculus). This is equivalent to the fact that
$xy$ and $yz$ make equal angles with $\gamma$. □
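Lemma 28.1 can be illustrated numerically. The sketch below takes a sample table (an ellipse with semi-axes 2 and 1, a hypothetical choice), fixes two boundary points $x$ and $z$, locates a maximum of $|xy| + |yz|$ over the arc between them by golden-section search, and then compares the angles that $xy$ and $yz$ make with the tangent line at $y$. All names and parameter values are illustrative assumptions.

```python
import math

A, B = 2.0, 1.0  # semi-axes of a sample table (hypothetical values)


def point(theta):
    return (A * math.cos(theta), B * math.sin(theta))


def tangent(theta):
    return (-A * math.sin(theta), B * math.cos(theta))


def path_length(theta, x, z):
    """|xy| + |yz| for the boundary point y = point(theta)."""
    y = point(theta)
    return math.dist(x, y) + math.dist(y, z)


def maximize(func, lo, hi, iterations=200):
    """Golden-section search for a local maximum of func on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    for _ in range(iterations):
        m1 = hi - inv_phi * (hi - lo)
        m2 = lo + inv_phi * (hi - lo)
        if func(m1) < func(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


def angle_with_tangent(theta, p):
    """Unsigned angle between the line joining point(theta) to p and the tangent line."""
    y, t = point(theta), tangent(theta)
    v = (p[0] - y[0], p[1] - y[1])
    cos_angle = abs(v[0] * t[0] + v[1] * t[1]) / (math.hypot(*v) * math.hypot(*t))
    return math.acos(min(1.0, cos_angle))


x, z = point(0.3), point(2.0)
theta = maximize(lambda th: path_length(th, x, z), 0.3, 2.0)
print(angle_with_tangent(theta, x), angle_with_tangent(theta, z))
```

The two printed angles agree up to the accuracy of the search, which is limited to roughly the square root of machine precision because the function is flat near its maximum.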

In the language of mechanics, the point (or, better, a bead) $y$ on the "wire" $\gamma$
is acted upon by two unit forces exerted by the elastic string: from $y$ to $x$ and from
$y$ to $z$. The point $y$ is in equilibrium if the total force is perpendicular to $\gamma$.

It is mentioned in Lecture 19 that the billiard ball map admits an invariant
area element; this is the area on the space of oriented lines in the plane, which is the
main character of Lecture 19. Now we shall deduce this fact from the variational
principle of Lemma 28.1.

Consider an arc length parameterization of $\gamma$ and let $x$ and $y$ be two values of
the parameter, that is, two points on the curve. Then $(x,y)$ are coordinates on the
space of oriented lines intersecting the billiard table.

Theorem 28.1. The area element

$$\omega(x,y) = \frac{\partial^2 |xy|}{\partial x\,\partial y}\,dx\,dy \tag{28.2}$$

is invariant under the billiard ball map $T$.

A necessary clarification before the proof: $dx\,dy$ is the oriented area of an
infinitesimal box whose sides are parallel to the coordinate axes and have lengths
$dx$ and $dy$. In this sense, $dy\,dy = 0$, and $dx\,dy = -dy\,dx$.²

Proof. Differentiate equation (28.1):

$$\frac{\partial^2 |xy|}{\partial x\,\partial y}\,dx + \frac{\partial^2 |xy|}{\partial y^2}\,dy + \frac{\partial^2 |yz|}{\partial y^2}\,dy + \frac{\partial^2 |yz|}{\partial y\,\partial z}\,dz = 0,$$

and multiply by $dy$. Taking into account that $dy\,dy = 0$ and $dz\,dy = -dy\,dz$, we
obtain:

$$\frac{\partial^2 |xy|}{\partial x\,\partial y}\,dx\,dy = \frac{\partial^2 |yz|}{\partial y\,\partial z}\,dy\,dz.$$

The last equation means that $\omega(x,y) = \omega(y,z)$, as claimed. □

² The technically correct notation is $dx \wedge dy$; the wedge product is skew-symmetric.

28.2 Optical properties of conics. Geometrically, an ellipse is defined as
the locus of points whose sum of distances to two given points, $F_1$ and $F_2$, is fixed;
these two points are called the foci. An ellipse can be constructed using a string
whose ends are fixed at the foci, see Figure 28.2. This is sometimes called the
"gardener" or "string construction" of an ellipse. A hyperbola is defined similarly,
with the sum of distances replaced by the absolute value of their difference; and a
parabola is the locus of points at equal distances from a point (focus) and a line.



Figure 28.2. String construction of an ellipse: $|F_1X| + |F_2X| = \text{const}$

The following optical property of conics was known to the ancient Greeks.

Lemma 28.2. A ray from one focus of an ellipse reflects to a ray through the 
other focus. 

Proof. We want to prove that the angles made by $F_1X$ and $F_2X$ with the ellipse
in Figure 28.2 are equal.

Assume that $X$ is free to vary in the plane. The ellipse is a level curve of the
function $f(X) = |F_1X| + |F_2X|$, and the gradient of this function at point $X$ is
orthogonal to the curve. Similarly to the proof of Lemma 28.1, the gradient of
$f(X)$ is the sum of the two unit vectors, having directions $F_1X$ and $F_2X$. This
sum is orthogonal to the curve if and only if the vectors make equal angles with it,
as claimed. □
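The optical property is easy to test numerically. In the sketch below the semi-axes 5 and 3 (so that the foci are at $(\pm 4, 0)$) and all function names are hypothetical choices; a ray is sent from $F_1$ to a boundary point, reflected in the tangent line there, and the distance from $F_2$ to the reflected line is measured.

```python
import math

A, B = 5.0, 3.0                      # sample semi-axes (hypothetical values)
C = math.sqrt(A * A - B * B)         # distance from the center to each focus
F1, F2 = (-C, 0.0), (C, 0.0)


def reflect(direction, normal):
    """Reflect a direction vector in the line orthogonal to `normal`."""
    dot = direction[0] * normal[0] + direction[1] * normal[1]
    nn = normal[0] ** 2 + normal[1] ** 2
    return (direction[0] - 2 * dot / nn * normal[0],
            direction[1] - 2 * dot / nn * normal[1])


def reflected_ray(theta):
    """Boundary point hit by the ray from F1, and the direction after reflection."""
    X = (A * math.cos(theta), B * math.sin(theta))
    incoming = (X[0] - F1[0], X[1] - F1[1])
    normal = (X[0] / A ** 2, X[1] / B ** 2)   # gradient of x^2/A^2 + y^2/B^2 at X
    return X, reflect(incoming, normal)


def distance_to_line(p, origin, direction):
    """Distance from p to the line through `origin` with the given direction."""
    w = (p[0] - origin[0], p[1] - origin[1])
    cross = direction[0] * w[1] - direction[1] * w[0]
    return abs(cross) / math.hypot(*direction)


for theta in (0.1, 1.0, 2.5, 4.0, 5.5):
    X, out = reflected_ray(theta)
    print(theta, distance_to_line(F2, X, out))
```

The printed distances are zero up to rounding, in agreement with Lemma 28.2.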

Likewise, a ray from the focus of a parabola reflects to a ray parallel to its axis,
see Figure 28.3 and Exercise 28.3. This optical property of parabolas has numerous
applications in designs of projectors, flashlights, and other optical devices, and we
already used it in Lecture 15.




Figure 28.3. Optical property of parabola

Consider an ellipse and a hyperbola with the same foci passing through point $X$.

Lemma 28.3. The ellipse and the hyperbola are orthogonal to each other.

Proof. The hyperbola is a level curve of the function $g(X) = |F_1X| - |F_2X|$,
whose gradient at point $X$ is the difference of the two unit vectors, having directions
$F_1X$ and $F_2X$. The difference of two unit vectors is orthogonal to their sum, hence
the curves are perpendicular to each other. □
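The key step of this proof, that the sum and the difference of two unit vectors are orthogonal, can be checked in a few lines. The foci and the sample points below are arbitrary illustrative values, and the function names are hypothetical.

```python
import math


def unit(p, q):
    """Unit vector pointing from p to q."""
    d = math.hypot(q[0] - p[0], q[1] - p[1])
    return ((q[0] - p[0]) / d, (q[1] - p[1]) / d)


def gradients(F1, F2, X):
    """Gradients at X of f = |F1 X| + |F2 X| and of g = |F1 X| - |F2 X|."""
    u, v = unit(F1, X), unit(F2, X)
    grad_f = (u[0] + v[0], u[1] + v[1])
    grad_g = (u[0] - v[0], u[1] - v[1])
    return grad_f, grad_g


F1, F2 = (-1.0, 0.0), (1.0, 0.0)
for X in [(0.5, 2.0), (-3.0, 0.7), (1.5, -1.2)]:
    grad_f, grad_g = gradients(F1, F2, X)
    print(X, grad_f[0] * grad_g[0] + grad_f[1] * grad_g[1])
```

The dot product equals $|u|^2 - |v|^2 = 0$ for unit vectors $u$ and $v$, so the printed values vanish up to rounding.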

The construction of an ellipse with given foci has a parameter, the length of
the string. The family of conics with fixed foci is called confocal. The equation of
a confocal family, including ellipses and hyperbolas, is

$$\frac{x^2}{a^2+\lambda} + \frac{y^2}{b^2+\lambda} = 1 \tag{28.3}$$

where $\lambda$ is a parameter.

The next result generalizes Lemma 28.2 to billiard trajectories not through the foci.

Theorem 28.2. A billiard trajectory inside an ellipse forever remains tangent
to a fixed confocal conic. More precisely, if a segment of a billiard trajectory does not
intersect the segment $F_1F_2$, then all the segments of this trajectory do not intersect
$F_1F_2$ and are all tangent to the same ellipse with foci $F_1$ and $F_2$; and if a segment
of a trajectory intersects $F_1F_2$, then all the segments of this trajectory intersect
$F_1F_2$ and are all tangent to the same hyperbola with foci $F_1$ and $F_2$.

Proof. Let $A_0A_1$ and $A_1A_2$ be consecutive segments of a billiard trajectory, see
Figure 28.4. Assume that $A_0A_1$ does not intersect the segment $F_1F_2$ (the other
case is dealt with similarly). It follows from the optical property of an ellipse that
the angles made by segments $F_1A_1$ and $F_2A_1$ with the ellipse are equal. Likewise,
the segments $A_0A_1$ and $A_2A_1$ make equal angles with the ellipse. Hence the angles
$A_0A_1F_1$ and $A_2A_1F_2$ are equal.



Figure 28.4. Proof of Theorem 28.2 



Reflect $F_1$ in $A_0A_1$ to point $F_1'$, and $F_2$ in $A_1A_2$ to $F_2'$. Let $B$ be the
intersection point of the lines $F_1'F_2$ and $A_0A_1$, and $C$ of the lines $F_2'F_1$ and
$A_1A_2$.

Consider the ellipse $\Gamma_1$ with foci $F_1$ and $F_2$ that is tangent to the line
$A_0A_1$. Since the angles $F_2BA_1$ and $F_1'BA_0$ are equal, and so are the angles
$F_1'BA_0$ and $F_1BA_0$, the angles $F_2BA_1$ and $F_1BA_0$ are equal. By the optical
property of ellipses, the ellipse $\Gamma_1$ touches $A_0A_1$ at the point $B$. Likewise
the ellipse $\Gamma_2$ with foci $F_1$ and $F_2$ touches $A_1A_2$ at the point $C$. One
wants to show that these two ellipses coincide or, equivalently, that
$F_1B + BF_2 = F_1C + CF_2$, which boils down to $F_1'F_2 = F_1F_2'$.

We claim that the triangles $F_1'A_1F_2$ and $F_1A_1F_2'$ are congruent. Indeed,
$F_1'A_1 = F_1A_1$ and $F_2A_1 = F_2'A_1$ by symmetry. In addition, the angles
$F_1'A_1F_2$ and $F_1A_1F_2'$ are equal: the angles $A_0A_1F_1$ and $A_2A_1F_2$ are equal,
hence so are the angles $F_1'A_1F_1$ and $F_2A_1F_2'$, and adding the common angle
$F_1A_1F_2$ implies that $\angle F_1'A_1F_2 = \angle F_1A_1F_2'$.

Equality of the triangles $F_1'A_1F_2$ and $F_1A_1F_2'$ implies that $F_1'F_2 = F_1F_2'$,
and we are done. □
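Theorem 28.2 can be observed in a short simulation. The sketch below assumes the table $x^2/A^2 + y^2/B^2 = 1$ with sample semi-axes 2 and 1 and uses the standard tangency condition for a conic: the line with unit normal $n$ at signed distance $p$ from the origin touches $x^2/(A^2+\lambda) + y^2/(B^2+\lambda) = 1$ exactly when $(A^2+\lambda)n_1^2 + (B^2+\lambda)n_2^2 = p^2$. The function names and starting data are hypothetical.

```python
import math

A, B = 2.0, 1.0   # semi-axes of the table x^2/A^2 + y^2/B^2 = 1 (sample values)


def caustic_parameter(p, d):
    """lambda of the confocal conic touched by the line through p with unit direction d."""
    n = (-d[1], d[0])                       # unit normal of the line
    dist = n[0] * p[0] + n[1] * p[1]        # signed distance from the origin
    return dist ** 2 - A ** 2 * n[0] ** 2 - B ** 2 * n[1] ** 2


def bounce(p, d):
    """Follow the ray from the boundary point p to the next boundary point and reflect."""
    s = (-2 * (p[0] * d[0] / A ** 2 + p[1] * d[1] / B ** 2)
         / (d[0] ** 2 / A ** 2 + d[1] ** 2 / B ** 2))
    q = (p[0] + s * d[0], p[1] + s * d[1])
    n = (q[0] / A ** 2, q[1] / B ** 2)      # normal to the ellipse at q
    k = 2 * (d[0] * n[0] + d[1] * n[1]) / (n[0] ** 2 + n[1] ** 2)
    return q, (d[0] - k * n[0], d[1] - k * n[1])


def caustic_parameters(theta0, theta1, bounces=50):
    """lambda along the trajectory whose first segment joins two boundary points."""
    p = (A * math.cos(theta0), B * math.sin(theta0))
    q = (A * math.cos(theta1), B * math.sin(theta1))
    length = math.hypot(q[0] - p[0], q[1] - p[1])
    d = ((q[0] - p[0]) / length, (q[1] - p[1]) / length)
    values = []
    for _ in range(bounces):
        values.append(caustic_parameter(p, d))
        p, d = bounce(p, d)
    return values


outer = caustic_parameters(0.3, 1.2)   # first segment misses the focal segment
inner = caustic_parameters(1.0, 4.5)   # first segment crosses the focal segment
print(min(outer), max(outer))
print(min(inner), max(inner))
```

In the first run the parameter stays at one value between $-B^2$ and $0$ (a confocal ellipse); in the second it stays at one value between $-A^2$ and $-B^2$ (a confocal hyperbola).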

As an application, here is a construction of a room with reflecting walls that
cannot be illuminated from any of its points; this construction is due to L. and
R. Penrose [60]; see Figure 28.5.³ The upper and the lower curves are half-ellipses
with foci $F_1, F_2$ and $G_1, G_2$. Since a ray passing between the foci reflects back
again between the foci, no ray can enter the four "ear lobes" from the area between the
lines $F_1F_2$ and $G_1G_2$, and vice versa. Thus if the source of light is above the line
$G_1G_2$, the lower lobes are not illuminated; and if the source is below $F_1F_2$, the
same applies to the upper lobes.




FIGURE 28.5. This room cannot be illuminated from a single point 



28.3 Caustics, string construction and the Graves theorem. A caustic⁴
is a curve inside a billiard table such that if a segment of a billiard trajectory is
tangent to this curve, then so is each reflected segment. We assume that caustics
are smooth and convex.

There is a string construction, similar to the string construction of ellipses, 
Figure 28.2, that recovers a billiard table from its caustic: wrap a closed non- 
stretchable string around the caustic, pull it tight at a point and move this point 
around to obtain the boundary of a billiard table, see Figure 28.6. 

Theorem 28.3. The billiard inside $\gamma$ has $\Gamma$ as its caustic.



³ Roger Penrose is a leading contemporary mathematical physicist. Lionel Penrose, his
father, was a prominent psychiatrist and geneticist.

⁴ Caustic also has a different meaning: the envelope of the normal lines to a curve; we
used it in Lecture 10.

Figure 28.6. String construction: recovering a billiard table from a caustic

Proof. Choose a reference point $Y$ on $\Gamma$. For a point $X$, let $f(X)$ and $g(X)$
be the distances from $X$ to $Y$ by going around $\Gamma$ on the right and on the left,
respectively. Then $\gamma$ is a level curve of the function $f(X) + g(X)$. We want to
prove that the angles made by the segments $AX$ and $BX$ with $\gamma$ are equal.

We claim that the gradient of $f$ at $X$ is the unit vector in the direction $AX$.
Indeed, the free end $X$ of the contracting string $YAX$ will move directly toward
point $A$ with unit speed (compare with the proofs of Lemmas 28.1 and 28.2). It
follows that the gradient of $f + g$ bisects the angle $AXB$. Since the gradient of a
function is orthogonal to its level curve, $AX$ and $BX$ make equal angles with $\gamma$. □

Note that the string construction provides a one-parameter family of billiard 
tables: the parameter is the length of the string. 

As a consequence of Theorem 28.2, one obtains the following Graves theorem: 
wrapping a closed non-stretchable string around an ellipse produces a confocal el- 
lipse, see [63] for other proofs. 

28.4 Geometrical consequences. The space of oriented lines that intersect
an ellipse is, topologically, a cylinder. This cylinder is foliated by invariant curves
of the billiard ball map, see Figure 28.7. Each curve represents the family of rays
tangent to a fixed confocal conic. The ∞-shaped curve corresponds to the family
of rays through the two foci. The two singular points of this curve represent the
major axis with the two opposite orientations, a 2-periodic, back-and-forth billiard
trajectory. Another 2-periodic trajectory is the minor axis, represented by the two
centers of the regions inside the ∞-shaped curve.

Consider the invariant curves that go around the cylinder; they represent the
rays tangent to confocal ellipses (other invariant curves, the ones inside the
∞-shaped curve, represent the rays tangent to confocal hyperbolas).

Theorem 28.4. One can choose a cyclic coordinate on each invariant curve, x 
mod 1, in such a way that the billiard ball map is given by the formula T(x) = x + c 
(the value of the constant c depends on the invariant curve). 

Proof. The construction of the desired coordinate $x$ depends on the two structures
available to us: the family of invariant curves and the area element $\omega$ (28.2)
on the cylinder.

Choose a function $f$ on the cylinder whose level curves are the invariant curves
of the billiard ball map. Let $\gamma$ be a level curve $f = a$. Consider the nearby level
curve $\gamma_\varepsilon$ given by $f = a + \varepsilon$. For an interval $I \subset \gamma$,
consider the area $\omega(I, \varepsilon)$ between $\gamma$ and $\gamma_\varepsilon$ over $I$;
clearly, this area tends to zero as $\varepsilon \to 0$. Define the "length" of $I$ as

$$\lim_{\varepsilon \to 0} \frac{\omega(I, \varepsilon)}{\varepsilon}.$$

Choosing a different function $f$, one replaces the infinitesimal $\varepsilon$ by another
one, say, $\delta$; then the length of every segment is multiplied by the same factor
$\delta/\varepsilon$. Choose a coordinate $x$ so that the length element is $dx$ and
normalize $x$ so that the total length is 1. This determines $x$ up to a shift
$x \mapsto x + \text{const}$.

The billiard ball map $T$ preserves the area element $\omega$ and the invariant curves.
Therefore it preserves the length element on the invariant curves, and hence it is
given by the formula $x \mapsto x + c$ on each invariant curve (of course, the value of
the constant $c$ depends on the invariant curve). □

Figure 28.7. Phase portrait of the billiard ball map in an ellipse

The first consequence is a closure theorem for billiard trajectories in an ellipse, 
cf. Lecture 29. 

Corollary 28.5. Assume that a billiard trajectory in an ellipse $\gamma$, tangent to
a confocal ellipse $\Gamma$, is $n$-periodic. Then every billiard trajectory in $\gamma$,
tangent to $\Gamma$, is $n$-periodic.

Proof. Consider the invariant curve that consists of the rays tangent to $\Gamma$. In
the coordinate $x$ from Theorem 28.4, the billiard ball map is $x \mapsto x + c$. A point
is $n$-periodic if and only if $nc$ is an integer. This condition does not depend on $x$,
and the result follows. □

Let $\gamma_1$, $\gamma_2$ and $\Gamma$ be confocal ellipses, see Figure 28.8. One has two
billiard ball maps, $T_1$ and $T_2$, corresponding to reflections in $\gamma_1$ and
$\gamma_2$. Both maps act on the same space of oriented lines that intersect both
ellipses, and they share a caustic, $\Gamma$. The choice of the parameter $x$ on the
invariant curve, corresponding to this caustic, depended only on the area element in the
space of oriented lines and the family of confocal ellipses, which are the same for both
maps. We arrive at the next corollary.

Figure 28.8. Commuting billiard ball maps in confocal ellipses

Corollary 28.6. The maps $T_1$ and $T_2$ commute: $T_1 \circ T_2 = T_2 \circ T_1$ (see
Figure 28.9 for a resulting configuration theorem).

Proof. Parallel translations $x \mapsto x + c_1$ and $x \mapsto x + c_2$ commute. □




Figure 28.9. The most elementary theorem of Euclidean geometry 



In the degenerate case, when $\Gamma$ is the segment connecting the foci, one obtains
the following "most elementary theorem of Euclidean geometry":⁵

$AB + BF = AD + DF$ if and only if $AC + CF = AE + EF$,

see Figure 28.9. Indeed, points $B$ and $D$ lie on an ellipse with foci $A$ and $F$ if and
only if so do points $C$ and $E$.



5 Discovered by M. Urquhart, 1902-1966, an Australian mathematical physicist; it was later 
found out that this theorem was published much earlier by De Morgan, in 1841. This is another 
manifestation of M. Berry's Law, mentioned in Lecture 15. 




28.5 Elliptic coordinates. Let us return to the confocal family of conics
(28.3). Through a generic point $P(x,y)$ there passes an ellipse and a hyperbola
from this family (the point $P$ should not lie on the segment connecting the foci;
this is the general position assumption in this case). Let $\lambda_1$ and $\lambda_2$ be
the respective values of the parameter $\lambda$. Then $(\lambda_1, \lambda_2)$ are called
the elliptic coordinates of the point $P$. The ellipses and hyperbolas from the confocal
family (28.3) play the role of coordinate curves of this coordinate system; they are
mutually orthogonal, see Figure 28.10.



FIGURE 28.10. Confocal ellipses and hyperbolas 

Consider now an ellipsoid $M$ in space

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1,$$

and assume that all semiaxes $a, b, c$ are distinct: $0 < a < b < c$. The confocal family
of quadratic surfaces $M_\lambda$ is defined by the equation

$$\frac{x^2}{a^2+\lambda} + \frac{y^2}{b^2+\lambda} + \frac{z^2}{c^2+\lambda} = 1 \tag{28.4}$$

where $\lambda$ is a real parameter. The type of the surface $M_\lambda$ changes as
$\lambda$ passes the values $-b^2$ and $-a^2$: for $-c^2 < \lambda < -b^2$, it is a
hyperboloid of two sheets, for $-b^2 < \lambda < -a^2$, a hyperboloid of one sheet, and
for $-a^2 < \lambda$, an ellipsoid; see Figure 28.11.

Similarly to the plane case, we introduce elliptic coordinates of a point $(x, y, z)$
as the three values of $\lambda$ for which equation (28.4) holds. A justification is
provided by the following theorem.

Theorem 28.7. A generic point $P = (x, y, z)$ is contained in exactly three
quadratic surfaces, confocal with the given ellipsoid. These confocal quadrics are
pairwise perpendicular at point $P$, see Figure 28.12.

Figure 28.11. Confocal quadratic surfaces



Proof. Given a point $P$, equation (28.4) can be rewritten as a cubic equation in
$\lambda$. We want to show that it has three real roots. Indeed, the graph of the
left-hand side, as a function of $\lambda$, looks as depicted in Figure 28.13. Therefore
this function assumes value 1 three times (assuming that $xyz \ne 0$: this suffices for
the present argument; in general, we need to assume that the discriminant of the cubic
equation in $\lambda$ does not vanish). Let $(\lambda_1, \lambda_2, \lambda_3)$ be the roots.

Next, we want to show that the quadrics are pairwise orthogonal at point $P$.
Consider, for instance, $M_{\lambda_1}$ and $M_{\lambda_2}$. A normal vector to
$M_{\lambda_1}$ at point $P$ is the gradient of the function on the left-hand side of
(28.4) (we divide it by 2 for convenience):

$$N_1 = \left(\frac{x}{a^2+\lambda_1},\ \frac{y}{b^2+\lambda_1},\ \frac{z}{c^2+\lambda_1}\right),$$

and likewise for $N_2$. Hence

$$N_1 \cdot N_2 = \frac{x^2}{(a^2+\lambda_1)(a^2+\lambda_2)} + \frac{y^2}{(b^2+\lambda_1)(b^2+\lambda_2)} + \frac{z^2}{(c^2+\lambda_1)(c^2+\lambda_2)}. \tag{28.5}$$

Consider equations (28.4) for $\lambda_1$ and $\lambda_2$. The difference of their
left-hand sides is equal to the right-hand side of (28.5) times
$(\lambda_2 - \lambda_1)$. Since both left-hand sides equal 1, this right-hand side
is zero, and $N_1 \cdot N_2 = 0$, as claimed. □

28.6 Apparent contours and Chasles' theorem. Our goal in this section
is to prove the following theorem, due to Chasles.

Theorem 28.8. A generic line in space is tangent to 2 distinct quadratic surfaces
from a given confocal family. The tangent planes to these quadrics at the
points of tangency with the line are orthogonal to each other.

Let $\ell$ be the line. The strategy of the proof is to project space along $\ell$ to the
orthogonal plane. A generic orthogonal projection of a surface on the plane (screen)
is a domain bounded by a curve, the apparent contour (or, simply, the shadow) of
the surface. The apparent contour is the locus of intersection points of the screen
with the lines, parallel to $\ell$ and tangent to the surface. For example, the apparent
contour of a convex surface is an oval.

The projection of the family of confocal quadrics yields a 1-parameter family
of apparent contours.

Figure 28.12. Three pairwise perpendicular confocal quadratic
surfaces, transparent and opaque
Proposition 28.4. The apparent contours of confocal quadrics form a family of
confocal conics.

This proposition implies the Chasles Theorem.

Proof of Theorem 28.8. The projection of the line $\ell$ is a point. Through this
point, there passes an ellipse and a hyperbola from a confocal family, and they
are orthogonal to each other. Each of the two curves is the apparent contour of a
quadratic surface from the given confocal family. Therefore these two surfaces are
tangent to $\ell$ and are orthogonal at the tangency points. □

Figure 28.13. The graph of the equation of a confocal family



Proof of Proposition 28.4. First of all, it is easy to show that the apparent
contour of an individual quadratic surface is a conic.

Assume that the screen is the horizontal $(x, y)$-plane and the line $\ell$ is vertical.
Our quadratic surface $M$ is given by a quadratic equation in $x, y, z$; its
specific form is not important to us (this equation is a combination of 10 terms:
$x^2, y^2, z^2, xy, yz, zx, x, y, z$ and constants). For a given point of the screen $(x,y)$,
the vertical line through this point has the parametric equation $(x,y,t)$. If we
substitute this into the equation of $M$, we obtain a quadratic equation in $t$:

$$p_0\,t^2 + p_1(x, y)\,t + p_2(x, y) = 0 \tag{28.6}$$

where $p_0$ is a constant, $p_1$ is a linear and $p_2$ a quadratic function of $x, y$.

The apparent contour of $M$ consists of those points $(x, y)$ for which the vertical
line through this point is tangent to $M$, that is, when equation (28.6) has a multiple
root. This happens when the discriminant equals zero:

$$p_1(x, y)^2 - 4\,p_0\,p_2(x, y) = 0.$$

This is a quadratic equation in $x$ and $y$, and it describes a conic on the screen.

It takes extra work to prove that the apparent contours of a confocal family
of quadrics form a confocal family of conics.

As we know, the normal vector to the quadric $M_\lambda$ at point $P(x, y, z)$ is

$$N(P) = \left(\frac{x}{a^2+\lambda},\ \frac{y}{b^2+\lambda},\ \frac{z}{c^2+\lambda}\right).$$

We have chosen the magnitude of the normal vector in such a way that $N(P) \cdot P = 1$;
this equation holds due to (28.4).

As $P$ varies over $M_\lambda$, the point $N(P)$ describes the quadric $M_\lambda^*$ given
by the equation

$$(a^2 + \lambda)x^2 + (b^2 + \lambda)y^2 + (c^2 + \lambda)z^2 = 1, \tag{28.7}$$

the latter equation being just another form of (28.4). Such a family of quadratic
surfaces is called linear: the left hand side can be written as $Q_1 + \lambda Q_2$ where
$Q_1$ and $Q_2$ are the quadratic forms

$$a^2x^2 + b^2y^2 + c^2z^2 \quad\text{and}\quad x^2 + y^2 + z^2.$$



Denote the screen by W. A line parallel to $\ell$ is tangent to $M_\lambda$ at point P
if and only if the normal N(P) is orthogonal to $\ell$, that is, if N(P) is parallel to
W. Note that these vectors N(P) are the normals to the apparent contour of the
surface $M_\lambda$, see Figure 28.14.




Figure 28.14. A surface and its apparent contour 



The set of such vectors N is the intersection curve of the quadratic surface $M_\lambda^*$
with the plane W. This curve is a conic given, in appropriate Cartesian coordinates
$(\xi,\eta)$ on W, by a formula similar to (28.7):

$$(\alpha^2+\lambda)\xi^2 + (\beta^2+\lambda)\eta^2 = 1. \tag{28.8}$$

Thus the normals to the apparent contours of the surfaces $M_\lambda$ form a linear family
of conics in the plane W.

In the plane, by the same token, it is also true that the normals to a confocal
family of conics constitute a linear family of conics. It follows that these apparent
contours form a confocal family on the screen. □



28.7 Geodesics on ellipsoids. Let M be a surface. A geodesic curve on
M is a trajectory of a free particle confined to stay on M. If $\gamma(t)$ is an arc length
parameterization of a geodesic, then the acceleration vector $\gamma''(t)$ is orthogonal to M
(physically, this means that the only force acting on the point is the normal force
that confines the point to M). Geodesics locally minimize the distance between
their points. For example, a geodesic on a developable surface becomes a straight
line after the surface is unfolded to a plane. The geodesics on a sphere are its great
circles. See Lecture 20 for a more detailed discussion.

Let M be an ellipsoid. The behavior of geodesics is very regular; it is described
by the following theorem, due to Chasles and Jacobi. This result is one of the
great achievements of 19th century mathematics. We assume that the ellipsoid has
distinct axes, so that the confocal family of quadratic surfaces is defined.$^6$

Theorem 28.9. The tangent lines to a geodesic on M are tangent to another
fixed quadratic surface, confocal with M.

Proof. Consider an arc length parameterized geodesic curve $\gamma(t)$ on M, and
let $\ell(t)$ be the straight line tangent to this geodesic at point $\gamma(t)$. By Theorem
28.8, $\ell(t)$ is tangent to another quadratic surface, $M_{\lambda(t)}$, confocal with M and
corresponding to parameter $\lambda(t)$ in equation (28.4). We want to prove that $\lambda(t)$ is
independent of t, that is,

$$\frac{d\lambda(t)}{dt} = 0. \tag{28.9}$$

Fix a value of t, say, t = 0. Let N be a normal vector to M at point $\gamma(0)$ and
let $\pi$ be the plane spanned by N and the line $\ell(0)$. Consider a close point $\gamma(\varepsilon)$.
Since the acceleration vector of the geodesic $\gamma$ is orthogonal to M, the line $\ell(\varepsilon)$ lies
in the plane $\pi$, up to an error of order $\varepsilon^2$.

Indeed, using the "Big O" notation, one has:

$$\gamma(\varepsilon) = \gamma(0) + \varepsilon\gamma'(0) + O(\varepsilon^2), \qquad \gamma'(\varepsilon) = \gamma'(0) + \varepsilon\gamma''(0) + O(\varepsilon^2).$$

The point $\gamma(0)$ and the vectors $\gamma'(0)$, $\gamma''(0)$ lie in the plane $\pi$. Hence, up to an error
of order $\varepsilon^2$, a point $\gamma(\varepsilon)$ of the line $\ell(\varepsilon)$ and its directional vector $\gamma'(\varepsilon)$ lie in $\pi$.

Let y be the tangency point of the line $\ell(0)$ with $M_{\lambda(0)}$. By Theorem 28.8, the
normal vector N lies in the tangent plane to $M_{\lambda(0)}$ at y, that is, this tangent plane
is the plane $\pi$, see Figure 28.15. Denote by $m(\varepsilon)$ the orthogonal projection of $\ell(\varepsilon)$
on the plane $\pi$. Note that $m(\varepsilon)$ and $\ell(\varepsilon)$ are close to order $\varepsilon^2$. Therefore, as far as
equality (28.9) is concerned, we may replace $\ell(\varepsilon)$ by $m(\varepsilon)$, that is, assume that
the line $\ell(\varepsilon)$ lies in the plane $\pi$.



Figure 28.15. Proving Theorem 28.9 



$^6$ Of course, the regular behavior of geodesics described in Theorem 28.9 extends, by
continuity, to ellipsoids with coinciding axes, that is, ellipsoids of revolution.

We want to prove that $\lambda(\varepsilon) - \lambda(0) = O(\varepsilon^2)$. Intuitively, this is clear: the line
$\ell(\varepsilon)$ lies in the tangent plane to the surface $M_{\lambda(0)}$ and is $\varepsilon^2$-close to this surface.
To make exact sense of this argument, we need a technical lemma.

Let $f(x,\varepsilon)$ be a smooth function of two variables; we think of this as a family of
functions in variable x, with $\varepsilon$ being a parameter, and use the suggestive notation
$f_\varepsilon(x)$. Assume that the function $f_0(x)$ has a critical point at x = 0, and this critical
point is non-degenerate: $f_0''(0) \ne 0$. Assume also that the respective critical value
is zero: $f_0(0) = 0$. Then, for every sufficiently small $\varepsilon$, the function $f_\varepsilon(x)$ has a
critical point near x = 0; let $c(\varepsilon)$ be the respective critical value, see Figure 28.16.




Figure 28.16. Lemma 28.5 



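Lemma 28.5. One has:

$$\left.\frac{dc(\varepsilon)}{d\varepsilon}\right|_{\varepsilon=0} = \left.\frac{\partial f_\varepsilon(0)}{\partial\varepsilon}\right|_{\varepsilon=0}. \tag{28.10}$$

Figure 28.17. Proof of Lemma 28.5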

Proof of Lemma. Expand $f_\varepsilon(x)$ in a series in $\varepsilon$:

$$f_\varepsilon(x) = f_0(x) + \varepsilon g(x) + O(\varepsilon^2). \tag{28.11}$$

Let $t(\varepsilon)$ be the critical point of the function $f_\varepsilon(x)$ near zero; since t(0) = 0, one
has: $t(\varepsilon) = O(\varepsilon)$. It follows from (28.11) that

$$c(\varepsilon) = f_\varepsilon(t(\varepsilon)) = f_0(t(\varepsilon)) + \varepsilon g(t(\varepsilon)) + O(\varepsilon^2).$$

Since $f_0$ has a critical point at x = 0 with zero critical value, $f_0(x) = O(x^2)$, and
hence $f_0(t(\varepsilon)) = O(\varepsilon^2)$. Also $g(t(\varepsilon)) = g(0) + O(\varepsilon)$. It follows that
$c(\varepsilon) = \varepsilon g(0) + O(\varepsilon^2)$. But (28.11) implies that $f_\varepsilon(0) = \varepsilon g(0) + O(\varepsilon^2)$.
Hence $c(\varepsilon) - f_\varepsilon(0) = O(\varepsilon^2)$, and (28.10) follows. See Figure 28.17. □
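The key estimate of the proof, that the critical value and the value at the origin differ only by a term of second order in the parameter, is easy to test on an example. The sketch below is an illustration only: the family f(x, eps) = x^2 + eps (2 + 3x + sin x) is a hypothetical choice of ours, and the critical point is located by Newton's method rather than by any procedure from the text.

```python
import math


def f(x, eps):
    """Sample family: f_0(x) = x^2 has a non-degenerate critical point
    at x = 0 with zero critical value; g(x) = 2 + 3x + sin x."""
    return x * x + eps * (2.0 + 3.0 * x + math.sin(x))


def df_dx(x, eps):
    return 2.0 * x + eps * (3.0 + math.cos(x))


def d2f_dx2(x, eps):
    return 2.0 - eps * math.sin(x)


def critical_value(eps, steps=50):
    """Critical value c(eps): run Newton's method on df/dx = 0,
    starting from the unperturbed critical point x = 0."""
    t = 0.0
    for _ in range(steps):
        t -= df_dx(t, eps) / d2f_dx2(t, eps)
    return f(t, eps)


def gap_ratio(eps):
    """(c(eps) - f_eps(0)) / eps^2; it should stay bounded as eps -> 0."""
    return (critical_value(eps) - f(0.0, eps)) / eps ** 2
```

For this family a short expansion gives the limit of `gap_ratio` as $-g'(0)^2/(2f_0''(0)) = -4$, so the ratio stabilizes instead of blowing up, which is what the proof asserts.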

Now we can finish the proof of Theorem 28.9. Assume that the tangency of the
line $\ell(0)$ with the surface $M_{\lambda(0)}$ is non-degenerate: for a quadratic surface, this
means that the line does not lie on the surface (cf. Lecture 16). We shall prove
the statement of the theorem for such a generic line, and then (28.9) extends to all
lines by continuity.

Recall that the surface $M_{\lambda(0)}$ from the confocal family (28.4) passes through
point y. Then, through every point in a vicinity of y, there passes a quadratic
surface from this confocal family, and we consider the corresponding elliptic
coordinate $\lambda$ as a function defined in a neighborhood of y. In particular, the value of
$\lambda$ at y is $\lambda(0)$.

We can restrict the function $\lambda$ to a straight line. A line is tangent to a quadratic
surface $M_c$ when the restriction of the function $\lambda$ to this line has a critical point
with the critical value c. Identify all the lines sufficiently close to $\ell(0)$ with the
real line, assuming that the origin on $\ell(0)$ is at point y. Let x be the variable on
$\mathbf{R}$. Subtract $\lambda(0)$ from the function $\lambda$ and denote its restriction to the line $\ell(\varepsilon)$ by
$f_\varepsilon(x)$.

Now we apply Lemma 28.5. Since the line $\ell(0)$ is tangent to $M_{\lambda(0)}$, the function
$f_0(x)$ has a non-degenerate critical point at x = 0 with zero critical value. The
distance between the origins on the lines $\ell(0)$ and $\ell(\varepsilon)$ is of order $\varepsilon$ (or higher).
Since the line $\ell(\varepsilon)$ lies in the tangent plane $\pi$ to the level surface $\{\lambda = \lambda(0)\}$, the
distance from the origin on this line to this surface is of order $\varepsilon^2$ or higher. That is,
$f_\varepsilon(0) = O(\varepsilon^2)$. By Lemma 28.5, $\lim_{\varepsilon\to 0} c(\varepsilon)/\varepsilon = 0$ where $c(\varepsilon) = \lambda(\varepsilon) - \lambda(0)$, and
(28.9) follows. □

Theorem 28.9 imposes very strong restrictions on the behavior of geodesics on
ellipsoids. The lines tangent to a fixed geodesic $\gamma$ on M are tangent to another
quadric Q, confocal with M. Let x be a point of $\gamma$. The tangent plane to M at x
intersects Q along a conic (depending on x). The number of tangent lines to this
conic from x can be equal to 2, 1 or 0 (the intermediate case of a single tangent
line, having multiplicity 2, occurs when x belongs to the conic). Thus the surface
M gets partitioned into two parts depending on the number, 2 or 0, of common
tangent lines of M and Q passing through a fixed point on M. The geodesic $\gamma$ is
confined to the former part and can have only one of the two possible directions at
every point, namely, the directions of the common tangent lines of M and Q; see
Figure 28.18.

In conclusion, two remarks. First, most of the results concerning billiards
inside the ellipsoid and geodesics on the ellipsoid have multi-dimensional analogs;
for example, the tangent lines to a geodesic on an ellipsoid in n-dimensional space
are tangent to n - 2 other fixed confocal quadratic hypersurfaces.

Secondly, if the smallest semi-axis of an ellipsoid tends to zero, the ellipsoid
degenerates to a doubly covered ellipse. The geodesic lines on the ellipsoid become
the billiard trajectories inside this ellipse, and Theorem 28.9 implies Theorem 28.2
as its limit case.

For more information about billiards in general, and in particular, billiards
inside ellipsoids and geodesics on ellipsoids, see, e.g., [78, 83].

Figure 28.18. A geodesic on an ellipsoid: the lines tangent to
the geodesic are tangent to a confocal hyperboloid (transparent
and opaque)



John Smith 
January 23, 2010 



Martyn Green 
August 2, 1936 



Henry Williams 
June 6, 1944

28.8 Exercises. 

28.1. Prove that, in terms of the angles in Figure 28.19, the area element (28.2)
can be expressed as $\omega = \sin\alpha\, d\alpha\, dx$.

Figure 28.19. The angles associated with a chord

28.2. (a) Prove that the ellipse, hyperbola and parabola, given by the "gardener 
construction" of Section 28.2, indeed have familiar equations of second degree. 

(b) Deduce the formula for a confocal family of conics (28.3).

28.3. Prove the optical property of the parabola. 

28.4. Prove that a billiard trajectory in an ellipse that starts at a focus tends 
to the major axis of the ellipse. 

28.5. (a) Consider a disc with center O and let A be a point inside the disc. 
For every point X of the circle fold the disc so that the point X coincides with 
point A. Prove that the envelope of the fold lines is the ellipse with the foci A and 
O. What happens if A lies outside of the disc? 

(b)* Given a smooth curve $\gamma$ and a point A, reflect the lines emanating from
A in $\gamma$. Let $\Gamma$ be the locus of points obtained from A by reflection in the tangent
lines to $\gamma$. Prove that $\Gamma$ is a curve orthogonal to the reflected lines.

Hint. Approximate $\gamma$ by an ellipse and use (a).

28.6. According to the optical property of the ellipse, rays emanating from 
a point source of light L located at a focus of the elliptic mirror will pass, after 
reflection, through the other focus. However, if L is not located at a focus, then 
the reflected rays will not pass through one point; on the contrary, they will have
an envelope, once again called the caustic. Draw a picture which allows one to
visualize these caustics; consider three cases: L is close to the focus, L is not close
to the focus but still inside the ellipse, L is outside of the ellipse. In the last case we
need to assume that the ellipse is both transparent and reflecting.

28.7. Geodesics emanating from a point of the sphere all arrive at the opposite
point. This will not be true, however, if one replaces the sphere by an ellipsoid.
Draw the family of geodesics emanating from a point of an ellipsoid (better take the
ellipsoid close to a sphere) in the neighborhood of the opposite point. What does
the envelope of this family look like? (Warning: your picture should not contradict
the fact that any two points of the ellipsoid can be joined by a geodesic.)

28.8.* Construct a trap for a parallel beam of light (by a trap we mean a non-closed
curve such that if a family of rays having, say, the vertical direction enters
the curve and starts to reflect in the curve according to the law of geometrical 
optics, then no ray will ever escape to infinity). 

Hint. Use the optical properties of the ellipse and the parabola. 

28.9. Find an elementary geometry proof of the "most elementary theorem of 
Euclidean geometry" . 

28.10. Show that the Cartesian coordinates are expressed in terms of the elliptic
ones as follows:

$$x^2 = \frac{(a^2+\lambda_1)(a^2+\lambda_2)}{a^2-b^2}, \qquad y^2 = \frac{(b^2+\lambda_1)(b^2+\lambda_2)}{b^2-a^2}.$$

28.11. The apparent contour of an algebraic surface given by an equation of
degree n is an algebraic plane curve given by an equation of degree N. Prove this
and find the relation between n and N.




LECTURE 29 

The Poncelet Porism and Other Closure Theorems 

29.1 The closure theorem. Consider two nested ellipses, $\gamma$ and $\Gamma$, choose
a point X on the outer one, draw a tangent line to the inner one until it intersects
the outer at point Y, repeat the construction, starting with Y, and so on. We
obtain a polygonal line, inscribed into $\Gamma$ and circumscribed about $\gamma$. Suppose that
this process is periodic: the n-th point coincides with the initial one. Now start
at a different point, say, $X_1$. The Poncelet closure theorem states that again the
polygonal line closes up after n steps, see Figure 29.1.

Poncelet's porism$^1$ is a classical result of projective geometry. It was discovered
by Jean-Victor Poncelet when he was a prisoner during the Napoleonic war in the
Russian city of Saratov in 1813-1814, and published in 1822 in his "Traité des
propriétés projectives des figures".

One can devise one's own closure theorem as follows. Start with a parametrized
oval $\Gamma(t)$ with t varying from 0 to 1. Choose a constant c and consider the
1-parameter family of chords $\Gamma(t)\Gamma(t+c)$. These chords have an envelope $\gamma$. This
envelope may have cusps (but not inflection points, see Lecture 8); suppose it is
smooth, which will always be the case if c is small enough. Then we obtain a pair
of nested ovals, $\Gamma$ and $\gamma$, for which the statement of the closure theorem holds.

Indeed, the correspondence $X \mapsto Y$ is given, in the parameter t, by the formula
$t \mapsto t + c$. A point returns back after n iterations if and only if nc is an integer.
This condition depends only on c, that is, on the pair of ovals, but not on the choice
of the starting point X, whence the closure theorem.
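As a small illustration of this "home-made" closure theorem, the sketch below takes the simplest oval, the unit circle with the parameter t measured in full turns (our assumption; any parametrized oval would do). It checks the return condition with exact rational arithmetic and verifies that every chord $\Gamma(t)\Gamma(t+c)$ stays at the same distance from the center, so that in this special case the envelope is a concentric circle. The function names are ours.

```python
import math
from fractions import Fraction


def returns_after(c, n):
    """True if the shift t -> t + c (mod 1) brings every point back after
    n steps, that is, if n*c is an integer. Pass c as a Fraction (or a
    string such as '2/7') so that the test is exact."""
    return (n * Fraction(c)).denominator == 1


def chord_distance_from_center(t, c):
    """Distance from the origin to the chord Gamma(t)Gamma(t + c) of the
    unit circle Gamma(t) = (cos 2*pi*t, sin 2*pi*t)."""
    x1, y1 = math.cos(2 * math.pi * t), math.sin(2 * math.pi * t)
    x2, y2 = math.cos(2 * math.pi * (t + c)), math.sin(2 * math.pi * (t + c))
    # Distance from the origin to the line through two points:
    # |x1*y2 - x2*y1| divided by the length of the chord.
    return abs(x1 * y2 - x2 * y1) / math.hypot(x2 - x1, y2 - y1)
```

With c = 2/7 the polygon closes after 7 steps (a star heptagon) from any starting point, and all chords are tangent to the circle of radius $|\cos \pi c|$.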

The question then is: given a pair of nested ellipses, how to choose a parameter
t on the outer one so that the correspondence $T : X \mapsto Y$ is given by the formula
$t \mapsto t + c$?

29.2 Proof. First of all, stretch the plane so that $\Gamma$ becomes a circle (a technical
name for stretching is an affine transformation). Since the Poncelet theorem
involves only lines (but not distances or angles), transformations of the plane that
take lines to lines do not violate the theorem (such transformations are called
projective). We consider the circle $\Gamma$ in its arc length parameter x.

$^1$ For all practical purposes, the Greek word "porism" means "theorem". One of the lost
books by Euclid was "Porisms".

Figure 29.1. The Poncelet closure theorem




Figure 29.2. Left and right tangent segments 

Denote by $R_\gamma(x)$ and $L_\gamma(x)$ the lengths of the right and left tangent segments
from point x to the curve $\gamma$, see Figure 29.2. Consider a point $x_1$, infinitesimally
close to x. Let O be the intersection point of the lines $xy$ and $x_1y_1$ and $\varepsilon$ the angle
between these lines. The line $x_1y_1$, as every line, makes equal angles with the circle
$\Gamma$; denote this angle by $\alpha$, see Figure 29.3.

What follows is, essentially, the argument from Theorem XXX, Figure 102, in
I. Newton's "Principia" [55]; see also Lecture 30.

Figure 29.3. Distortion of the arc length

By the Sine theorem,

$$\frac{|yy_1|}{L_\gamma(y)} = \frac{\sin\varepsilon}{\sin\alpha} = \frac{|xx_1|}{R_\gamma(x)}$$

or

$$\frac{dy}{L_\gamma(y)} = \frac{dx}{R_\gamma(x)}. \tag{29.1}$$



Assume, for the moment, that $\gamma$ is a circle too. Then the right and left tangent
segments are equal: $R_\gamma(x) = L_\gamma(x)$. Denote this common value by $D_\gamma(x)$. It follows
from (29.1) that the length element $dx/D_\gamma(x)$ is invariant under the transformation
T. It remains to choose a parameter t so that this length element is dt; this is done
by integrating:

$$t = \int \frac{dx}{D_\gamma(x)},$$

and the transformation T becomes a translation $t \mapsto t + c$.

Finally, if $\gamma$ is not a circle, let A be a stretching of the plane that takes $\gamma$ to
a circle. An affine transformation does not change the ratio of parallel segments.
Taking (29.1) into account, we have:

$$\frac{dx}{dy} = \frac{R_\gamma(x)}{L_\gamma(y)} = \frac{R_{A\gamma}(Ax)}{L_{A\gamma}(Ay)} = \frac{D_{A\gamma}(Ax)}{D_{A\gamma}(Ay)}.$$

One again obtains a length element $dx/D_{A\gamma}(Ax)$, invariant under the transformation
T. As before, we choose a parameter t so that T(t) = t + c, and Poncelet's
theorem follows. □

Remark 29.1. One can deduce the Poncelet porism from the complete integrability
of the billiard inside an ellipse: Corollary 28.5 establishes the closure theorem
for a pair of confocal ellipses, see Exercise 29.1.

29.3 Ramifications. A conic is determined by 5 of its points. A pencil of
conics is a 1-parameter family of conics that share 4 fixed points. These points
may be complex, and then they are "invisible" in the real plane. Algebraically, let
P(x, y) = 0 and Q(x, y) = 0 be quadratic equations of two conics. These conics
determine the pencil given by the equation P(x, y) + tQ(x, y) = 0 where t is a
parameter.$^2$

The algebraic definition of a pencil does not exclude the cases when some of the
4 points coincide or lie at infinity. For example, the family of concentric circles is
a pencil. Indeed, a conic is a circle if and only if it passes through two very special
"circular" points at infinity, (1 : i : 0) and (-1 : i : 0). The concentric circles are
all tangent to each other at these points, so in this case, the four points merge into
two "double points".

Consider a number of nested conics from the same pencil $\gamma_1, \gamma_2, \dots, \gamma_k, \Gamma$ where
$\Gamma$ is the outermost one. Let us modify the game: choose a point X on $\Gamma$, draw a
tangent line to $\gamma_1$ to meet $\Gamma$ again at Y; draw a tangent line from Y to $\gamma_2$, and so on;
finally, draw a tangent line to $\gamma_k$ to meet $\Gamma$ at Z, see Figure 29.4. The correspondence
$X \mapsto Z$ enjoys the same property: if its n-th iteration takes a point X to itself then
every other point returns back after n steps. This is a generalization of the Poncelet
porism. (A different, projectively dual, generalization is given in Corollary 28.6.)

Figure 29.4. Generalized, or "Big", Poncelet theorem

Given two nested ellipses $\gamma$ and $\Gamma$, one wants to determine whether the
inscribed-circumscribed Poncelet polygon closes up after n steps. If both conics are
circles (not necessarily concentric!) and n = 3, 4, explicit answers were known before
Poncelet discovered his theorem. See Exercise 29.2.
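The case of two circles and n = 3 is convenient for a numerical experiment. The sketch below assumes Chapple's formula from Exercise 29.2, $a^2 = R^2 - 2rR$, where R and r are the radii and a is the distance between the centers; the outer circle is centered at the origin and the inner one at (a, 0). The function names and the sample radii are our own choices, and the tangent is always taken on the same side so that the polygon keeps moving forward.

```python
import math


def poncelet_step(point, center, r):
    """One step X -> Y: from the point X on the outer circle (centered at
    the origin) draw a tangent line to the inner circle (given center,
    radius r) and return its second intersection with the outer circle."""
    x, y = point
    dx, dy = center[0] - x, center[1] - y
    dist = math.hypot(dx, dy)
    # Rotate the direction toward the inner center by the half-angle under
    # which the inner circle is seen from X: this is a tangent direction.
    phi = math.atan2(dy, dx) + math.asin(r / dist)
    ux, uy = math.cos(phi), math.sin(phi)
    # |X + s*u|^2 = |X|^2 has the roots s = 0 and s = -2 <X, u>.
    s = -2.0 * (x * ux + y * uy)
    return (x + s * ux, y + s * uy)


def closure_defect(start_angle, R, a, r, n):
    """Distance between the starting point on the outer circle and its
    image after n steps."""
    start = (R * math.cos(start_angle), R * math.sin(start_angle))
    p = start
    for _ in range(n):
        p = poncelet_step(p, (a, 0.0), r)
    return math.hypot(p[0] - start[0], p[1] - start[1])


# A pair of circles satisfying Chapple's formula with R = 1, r = 0.3.
R_OUT, R_IN = 1.0, 0.3
A_CHAPPLE = math.sqrt(R_OUT ** 2 - 2.0 * R_IN * R_OUT)
```

For the pair satisfying the formula, `closure_defect(angle, R_OUT, A_CHAPPLE, R_IN, 3)` should vanish up to rounding for every starting angle, while a pair violating the formula leaves a visible gap.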

$^2$ In Lecture 28, we called a pencil a "linear system of conics".

A general answer was found by Cayley, see [38]. We shall describe this answer
(without proof) in a particular case, when the outer conic is the unit circle
$x^2 + y^2 = 1$ and the inner one is a concentric ellipse $a^2x^2 + b^2y^2 = 1$. Consider the
Taylor series

$$\sqrt{(a^2+t)(b^2+t)(1+t)} = c_0 + c_1t + c_2t^2 + \cdots$$

where each $c_i$ is a function of a and b (for example, $c_0 = ab$). The Poncelet polygon
closes up after n steps if and only if

$$\det\begin{pmatrix} c_2 & \cdots & c_{m+1} \\ \vdots & & \vdots \\ c_{m+1} & \cdots & c_{2m} \end{pmatrix} = 0$$

for n = 2m + 1, and

$$\det\begin{pmatrix} c_3 & \cdots & c_{m+1} \\ \vdots & & \vdots \\ c_{m+1} & \cdots & c_{2m-1} \end{pmatrix} = 0$$

for n = 2m.

29.4 The Poncelet grid. Let $\gamma$ and $\Gamma$ be two nested ellipses. R. Schwartz
recently discovered an interesting property of Poncelet polygons, that is, polygons
inscribed into $\Gamma$ and circumscribed about $\gamma$ [70].

Let $L_1, \dots, L_n$ be the lines containing the sides of the polygon and listed in the
cyclic order of their tangency points with $\gamma$. Consider the intersection set of these
lines: $A_{ij} = L_i \cap L_j$. Let us assume that $A_{ii}$ is the tangency point of $L_i$ with $\gamma$.
Assume also that n is odd (the formulation for even n is a little different).

The points $A_{ij}$ constitute a finite set that we call the Poncelet grid. Let us
decompose this grid into subsets in two ways. For each $j = 0, 1, \dots, n-1$, the
"circular" set $P_j$ consists of the points $A_{i,i+j}$ (of course, we understand the indices
cyclically, so that n + 1 = 1, etc.), and the "radial" set $Q_j$ of the points $A_{j-i,j+i}$.
Note that $P_j = P_{n-j}$, so there are (n + 1)/2 circular sets $P_j$, each containing n
points. There are n radial sets $Q_j$, each containing (n + 1)/2 points. All this is
illustrated in Figure 29.5 where n = 7.

According to the Schwartz theorem, each circular set $P_j$ lies on an ellipse,$^3$
say, $\gamma_j$, and $P_j$ consists of the vertices of a Poncelet polygon, inscribed into $\gamma_j$ and
circumscribed about $\gamma$. Likewise, each radial set $Q_j$ lies on a hyperbola. Furthermore,
all the circular sets $P_j$ are projectively equivalent: for all $j, j'$ there exists
a projective transformation that takes $P_j$ to $P_{j'}$. The same projective equivalence
holds for the radial sets $Q_j$.

Analogous results hold for even n. R. Schwartz proved his theorem using complex
algebraic geometry. One can also deduce the Schwartz theorem from properties
of billiards in ellipses, see [52].

29.5 Money-Coutts theorem. There are a number of other closure theorems
that resemble the Poncelet porism. One is Steiner's theorem on a chain of circles
tangent to two given ones, $\gamma$ and $\Gamma$, see Figure 29.6. The statement is that if such
a chain closes up after n steps, starting from some point, then this will happen for
any other starting point as well. Steiner's theorem becomes obvious if one applies
an appropriate geometric transformation: there is an inversion that takes $\gamma$ and $\Gamma$
into concentric circles. (Note however that there is no such proof of the Poncelet
porism which is therefore a much harder result.)

Another curious theorem concerns mutually tangent circles inscribed into polygons.
Here is the simplest one. Consider a triangle $A_1A_2A_3$. Inscribe a circle $C_1$
into the angle $A_3A_1A_2$, then the circle $C_2$ into the angle $A_1A_2A_3$, tangent to $C_1$,
then the circle $C_3$ into the angle $A_2A_3A_1$, tangent to $C_2$, and so on cyclically, see
Figure 29.7.

$^3$ This statement was known to Darboux [19].

Figure 29.5. Poncelet grid

Figure 29.6. Steiner's theorem

Theorem 29.1. This sequence of circles is 6-periodic: $C_7 = C_1$, see Figure
29.8.

Figure 29.8. Six-periodicity of the process



Proof. The proof consists of a clever change of variables that appears somewhat
of a miracle.

Let the angles of the triangle be $2\alpha_1$, $2\alpha_2$ and $2\alpha_3$. Consider the first two circles;
let $r_1$ and $r_2$ be their radii. We claim that

$$r_1\cot\alpha_1 + 2\sqrt{r_1r_2} + r_2\cot\alpha_2 = |A_1A_2|. \tag{29.2}$$

Indeed, in Figure 29.9,

$$|A_1P_1| = r_1\cot\alpha_1, \quad |A_2P_2| = r_2\cot\alpha_2 \quad\text{and}\quad |P_1P_2| = 2\sqrt{r_1r_2}.$$

Figure 29.9. Relation between two inscribed circles

Set: $r_1\cot\alpha_1 = u_1^2$, $r_2\cot\alpha_2 = u_2^2$ and $\sqrt{\tan\alpha_1\tan\alpha_2} = e$. Then equation
(29.2) can be rewritten as

$$u_1^2 + 2eu_1u_2 + u_2^2 = |A_1A_2|. \tag{29.3}$$

Equation (29.3) implies:

$$u_1 + eu_2 = \sqrt{|A_1A_2| - (1-e^2)u_2^2}, \qquad u_2 + eu_1 = \sqrt{|A_1A_2| - (1-e^2)u_1^2}. \tag{29.4}$$

Rewrite (29.3) as

$$u_1(u_1 + eu_2) + u_2(u_2 + eu_1) = |A_1A_2|$$

or, in view of (29.4), as

$$u_1\sqrt{|A_1A_2| - (1-e^2)u_2^2} + u_2\sqrt{|A_1A_2| - (1-e^2)u_1^2} = |A_1A_2|. \tag{29.5}$$

Inscribe a circle into our triangle; let r be its radius and $a_1, a_2, a_3$ the tangent
segments from the vertices to the circle, see Figure 29.10. Let $p = a_1 + a_2 + a_3$ be
the semi-perimeter and S the area. On the one hand, S = rp, and on the other,
$S = \sqrt{pa_1a_2a_3}$ by the Heron formula. Therefore $r^2 = a_1a_2a_3/p$.




Figure 29.10. Proving 6-periodicity 



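Let us find e. One has: $\tan\alpha_1 = r/a_1$, $\tan\alpha_2 = r/a_2$, hence

$$e = \sqrt{\frac{r^2}{a_1a_2}} = \sqrt{\frac{a_3}{p}}.$$

In particular, e < 1. It follows also that

$$1 - e^2 = \frac{p - a_3}{p} = \frac{|A_1A_2|}{p}.$$

We now rewrite equation (29.5) as

$$\frac{u_1}{\sqrt{p}}\sqrt{1 - \frac{u_2^2}{p}} + \frac{u_2}{\sqrt{p}}\sqrt{1 - \frac{u_1^2}{p}} = \sqrt{\frac{|A_1A_2|}{p}}. \tag{29.6}$$

A brief glance at this equation reveals the final substitution:

$$\frac{u_1}{\sqrt{p}} = \sin\varphi_1, \qquad \frac{u_2}{\sqrt{p}} = \sin\varphi_2,$$

and (29.6) is finally rewritten as

$$\sin(\varphi_1 + \varphi_2) = \sqrt{\frac{|A_1A_2|}{p}}$$

or

$$\varphi_1 + \varphi_2 = \sin^{-1}\sqrt{\frac{|A_1A_2|}{p}} := \beta_3 \tag{29.7}$$

(the last equality is just a convenient notation).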

Now, let us pause and reflect on what we have achieved. Initially, we characterized
the circles inscribed into the two angles of the triangle by their radii, $r_1$
and $r_2$, and the tangency condition was a complicated equation (29.2). Then we
changed the variables to $\varphi_1$ and $\varphi_2$, and the tangency condition simplified to (29.7).

Introduce the third class of circles, the ones inscribed into the angle $A_2A_3A_1$;
they are characterized by their radius $r_3$ and by the variable $\varphi_3$, related to $r_3$ the
same way as $\varphi_1$ and $\varphi_2$ to $r_1$ and $r_2$.

Now, the final touch. For the first seven circles, we have:

$$\varphi_1 + \varphi_2 = \beta_3,\ \ \varphi_2 + \varphi_3 = \beta_1,\ \ \varphi_3 + \varphi_4 = \beta_2,\ \ \varphi_4 + \varphi_5 = \beta_3,\ \ \varphi_5 + \varphi_6 = \beta_1,\ \ \varphi_6 + \varphi_7 = \beta_2.$$

Take the first equation, subtract the second, add the third, subtract the fourth, etc.
The result is: $\varphi_1 - \varphi_7 = 0$, that is, the 7-th circle coincides with the first one. □
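The six-periodicity can also be observed numerically, without the change of variables. The sketch below is an illustration under our own assumptions: it uses the tangency relation (29.2) in the form r_k cot(alpha_i) + 2 sqrt(r_k r_next) + r_next cot(alpha_j) = |A_i A_j|, solves it as a quadratic equation in sqrt(r_next), and takes the positive root (this presumes that the starting circle is small enough for such a root to exist at every step). The cotangents are computed from the tangent segments and the inradius as in the proof; the function names and the 3-4-5 test triangle are our choices.

```python
import math


def next_radius(r_k, cot_i, cot_j, side):
    """Radius of the next circle, tangent to the circle of radius r_k and
    inscribed into the adjacent angle: the positive root s = sqrt(r_next)
    of cot_j*s^2 + 2*sqrt(r_k)*s + (r_k*cot_i - side) = 0."""
    root_rk = math.sqrt(r_k)
    disc = r_k - cot_j * (r_k * cot_i - side)
    s = (-root_rk + math.sqrt(disc)) / cot_j
    return s * s


def circle_chain(sides, r1, count=7):
    """Radii of the first `count` circles for the triangle with side
    lengths sides = (|A1A2|, |A2A3|, |A3A1|); the first circle has
    radius r1 and is inscribed into the angle at A1."""
    l12, l23, l31 = sides
    p = (l12 + l23 + l31) / 2.0
    a = (p - l23, p - l31, p - l12)          # tangent segments at A1, A2, A3
    r_in = math.sqrt(a[0] * a[1] * a[2] / p)  # inradius, from r^2 = a1*a2*a3/p
    cot = [ai / r_in for ai in a]             # cot(alpha_i) = a_i / r
    side = (l12, l23, l31)                    # side[i] joins vertices i, i+1
    radii = [r1]
    for k in range(count - 1):
        i, j = k % 3, (k + 1) % 3
        radii.append(next_radius(radii[-1], cot[i], cot[j], side[i]))
    return radii
```

For the right triangle with sides 4, 5, 3 and r1 = 0.5, the list returned by `circle_chain((4.0, 5.0, 3.0), 0.5)` should end with the value it started from, while the fourth circle (inscribed into the same angle as the first) is visibly different.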

Theorem 29.1 is closely related to the Malfatti problem of elementary geometry
that asks to construct three pairwise tangent circles, inscribed into the three angles
of a triangle. The problem was solved by G. Malfatti in 1803 but it continued to
attract the interest of prominent geometers of the 19th century such as Steiner,
Plücker and Cayley. A solution to the Malfatti problem readily follows from our
formulas: for Malfatti circles, $\varphi_4 = \varphi_1$, $\varphi_5 = \varphi_2$, $\varphi_6 = \varphi_3$, and

$$\varphi_1 = \frac{\beta_3 + \beta_2 - \beta_1}{2}, \qquad \varphi_2 = \frac{\beta_1 + \beta_3 - \beta_2}{2}, \qquad \varphi_3 = \frac{\beta_2 + \beta_1 - \beta_3}{2},$$

which uniquely describes the circles.

An amazing fact is that the statement of Theorem 29.1 still holds if one replaces
a triangle with straight sides by one made of circular arcs! This was discovered by
amateur mathematicians G. B. Money-Coutts and C. J. Evelyn by careful drawing,$^4$
and proved by Tyrrell and Powell in 1971, see [86]. The proof resembled our
proof of Theorem 29.1 but the role of trigonometric functions was played by (more
complicated) elliptic functions. (Incidentally, the invariant function for the billiard
ball map in an ellipse can be expressed in elliptic functions as well.)

$^4$ The personal-computer era had still to arrive.

Figure 29.11. No periodicity for a generic polygon



What about other polygons? The game is to inscribe circles into consecutive
angles, each circle tangent to the previous one. Figure 29.11 shows a generic pentagon
and hexagon: we see that the inscribed circles do not exhibit any periodicity.
For a generic quadrilateral, the behavior of the circles is quite chaotic [84].

However, periodicity is redeemed if the n-gon satisfies a special condition,
depicted in Figure 29.12. Assume that n > 5. Let the vertices of a polygon be
$A_1, A_2, \dots$, and let the interior angles be $2\alpha_1, 2\alpha_2, \dots$. Assume that
$\alpha_i + \alpha_{i+1} > \pi/2$ for all i. Denote by $D_i$ the intersection point of the lines
$A_{i-1}A_i$ and $A_{i+1}A_{i+2}$. Consider the excircles of the triangles $A_{i-1}A_iD_{i-1}$ and
$A_iA_{i+1}D_i$, tangent to the sides $A_iD_{i-1}$ and $A_iD_i$, respectively. The condition is
that these two excircles coincide for all i.




Figure 29.12. Condition that redeems periodicity

Then, for odd n, the sequence of inscribed circles is 2n-periodic. For even n,
one needs one additional condition:

$$\frac{\prod_{i=1,\ i\ \mathrm{odd}}^{n}\left(\sqrt{1 - \cot\alpha_i\cot\alpha_{i+1}} + 1\right)}{\prod_{i=1,\ i\ \mathrm{even}}^{n}\left(\sqrt{1 - \cot\alpha_i\cot\alpha_{i+1}} + 1\right)} = 1,$$

and if this condition holds then the sequence of inscribed circles is n-periodic.

This result is proved similarly to Theorem 29.1, and like the latter, also has a
version for polygons made of arcs of circles, see [81]. For more about the Poncelet
porism, its history and generalizations, see [5, 10].

29.6 Exercises.

29.1. Show that an arbitrary pair of nested ellipses can be taken to confocal 
ones by a projective transformation. Deduce the Poncelet porism from Corollary 
28.5. 

29.2. Let $\Gamma$ and $\gamma$ be circles of radii R and r, and let a be the distance between
their centers. Assume that $\gamma$ lies inside $\Gamma$.

(a) Prove that there exists a triangle, inscribed into $\Gamma$ and circumscribed about
$\gamma$, if and only if $a^2 = R^2 - 2rR$ (Chapple's formula).

(b) Prove that there exists a quadrilateral, inscribed into $\Gamma$ and circumscribed
about $\gamma$, if and only if $(R^2 - a^2)^2 = 2r^2(R^2 + a^2)$ (Fuss's formula).

29.3. (a) Prove Steiner's theorem illustrated in Figure 29.6.

(b) Show that the centers of the circles, tangent to $\Gamma$ and $\gamma$, lie on an ellipse
whose foci are the centers of $\Gamma$ and $\gamma$.

29.4.* Given a generic triple of lines $\ell_1, \ell_2, \ell_3$, how many triples of circles
$C_1, C_2, C_3$ are there such that every two circles are externally tangent to each other,
and $C_1$ is tangent to $\ell_2$ and $\ell_3$, $C_2$ to $\ell_3$ and $\ell_1$, and $C_3$ to $\ell_1$ and $\ell_2$?

29.5. The original Malfatti problem asked to inscribe three non-overlapping
circles into a given triangle such that the sum of their areas is the greatest possible.
Malfatti assumed that this maximum is attained when each circle touches the other
two. Prove that this assumption is wrong.

Hint. Consider an equilateral triangle. 

Comment. A complete solution to this extremum problem was published as 
late as 1992 [92]. 




LECTURE 30 

Gravitational Attraction of Ellipsoids 

In this last lecture we freely use physical terminology, assuming from the reader
some basic knowledge of physics along with common sense. For example, a function
is called a potential for a field of forces if the force vector is the gradient of
this function; an equipotential surface is a level surface of a potential function, etc.
Needless to say, the gravitational attraction is proportional to the masses and
inversely proportional to the squared distance between the bodies, and likewise for the
Coulomb attraction/repulsion of electrical charges.

30.1 No gravity in a cavity. I. Newton, one of the creators of mathematical 
analysis, was a great master of geometrical arguments. His main book, Philosophiae 
Naturalis Principia Mathematica (Mathematical Principles of Natural Philosophy) 
[55], is full of geometrical figures and is almost entirely devoid of formulas. Theorem
30 (Proposition 70) of Section 12 "The attractive forces of spherical bodies" from 
Principia states: 

"If toward each of the separate points of a spherical surface there tend equal 
centripetal forces decreasing as the squares of the distances from the point, I say 
that a corpuscle placed inside the surface will not be attracted by these forces in 
any direction." 

In other words, there is no gravity inside a uniform sphere (or rather, infinites- 
imally thin spherical shell). 

Here is (slightly modified) Newton's proof (cf. Lecture 29). Let P be a point
inside the sphere. Consider an infinitesimal cone with vertex P. The intersection of
the sphere and the cone consists of two infinitesimal domains A and B, see Figure
30.1. Let us show that the forces of gravitational attraction exerted at P by these
two domains cancel each other.

Figure 30.1. The attraction forces at point P cancel each other

The attraction forces from P to the domains A and B are proportional to
their masses, that is, their areas, and inversely proportional to the squares of their
distances to P. The axis of the cone makes equal angles with the sphere. Therefore
the two infinitesimal cones with the common vertex at P are similar, and the ratios
of the areas of their bases to the squares of the distances to P are equal. Hence
the attraction forces at P are equal and have opposite directions. To quote Newton
once again:

"Accordingly, body P is not impelled by these attractions in any direction.
Q.E.D."

Two remarks are in order. First, the same argument proves that there is no
gravitational force inside a spherical shell of any width: it is made of infinitely thin
shells, and for each of them the gravitational force vanishes. Secondly, the electric
force inside a uniformly charged sphere vanishes as well: the Coulomb force is also
subject to the inverse square law.

30.2 Attraction outside a sphere. Next, Newton considers the attraction 
force outside a homogeneous sphere. Theorem 31 (Proposition 71) states: 

"With the same conditions being supposed as in prop. 70, I say that a corpuscle 
placed outside the spherical surface is attracted to the center of the sphere by a 
force inversely proportional to the square of its distance from the same center." 

That is, a uniform sphere attracts as a mass-point of equal total mass placed 
in the center. 

Newton's proof is again geometrical, but rather involved; we shall give a differ- 
ent, more transparent and more conceptual, argument. 

Consider the motion of non-compressible fluid toward a sink located at the 
origin O. The flow lines are radial, and the flux through any sphere, centered 
at O, is the same. The surface area of a sphere of radius r is $4\pi r^2$. Hence the 
speed of fluid, as a function of the distance from the origin, is proportional to 
$r^{-2}$. Conclusion: the velocity field of spherically symmetric non-compressible fluid 
toward a sink is the same as the gravitational force field of a mass point. 

The gravitational force field of any mass distribution is the sum of the forces 
exerted by the individual masses. It follows that the gravitational force field of any 
mass distribution has the same non-compressible property: the flux through the 
boundary of any domain that does not contain masses is zero.¹ 

Back to the gravitational attraction of a uniform sphere centered at point O. 
Due to the spherical symmetry, the force at a test point P depends only on the dis- 
tance PO and has the radial direction. Moreover, the force field is non-compressible. 
The only non-compressible spherically symmetric radial field is the field of gravita- 
tional attraction of a mass point located at O. To see that the mass of this point 
equals the total mass of the sphere, it suffices to compare the fluxes of the two fields 
through a sufficiently large sphere centered at O. 
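
Both of Newton's theorems can be checked numerically. The sketch below is only an illustration under stated assumptions: the shell has radius 1 and total mass 1, the gravitational constant is set to 1, and the function name `shell_axial_force` is ours. The shell is cut into thin rings perpendicular to the axis through the test point, and the axial components of their attractions are summed by the midpoint rule.

```python
import math

def shell_axial_force(d, steps=20000):
    """Attraction toward the center felt at distance d from the center of a
    uniform spherical shell (radius 1, total mass 1, G = 1)."""
    total = 0.0
    dtheta = math.pi / steps
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        # Mass of the ring at polar angle theta: (1/2) sin(theta) dtheta.
        ring_mass = 0.5 * math.sin(theta) * dtheta
        # Squared distance from the test point (0, 0, d) to the ring.
        s2 = 1.0 + d * d - 2.0 * d * math.cos(theta)
        # Axial component of the inverse-square attraction, toward the center.
        total += ring_mass * (d - math.cos(theta)) / s2 ** 1.5
    return total

if __name__ == "__main__":
    for d in (0.3, 0.5, 2.0, 3.0):
        print(d, shell_axial_force(d))
```

Inside the shell the computed force is zero up to discretization error, and outside it agrees with $1/d^2$, the field of a unit point mass at the center.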

Both results, Newton's Theorems 30 and 31, hold in space of any dimension n, 
provided the attraction force of points at distance r is proportional to $r^{1-n}$. 

30.3 Free distribution of charge. Drop charged liquid on a closed conduct- 
ing surface, and the charge will freely distribute on the surface. For example, for a 
sphere, this free distribution of charge is uniform. 

Free distribution of charge has two properties. First, the potential on the 
surface itself is constant. This is obvious: a difference of potential at two points 
would cause the charged liquid to move from one point to another. 

Secondly, the electric force inside the surface vanishes or, equivalently, the 
potential is constant. Indeed, assume that the potential is not constant. Being 
constant on the boundary surface, it attains either minimum or maximum at an 
interior point, say, P. Consider a small equipotential surface surrounding P. Then 
the electric force field enters (or exits) this surface, in contradiction to the fact 
that the force field is non-compressible. 

This argument provides an alternative proof of Newton's theorem that the 
gravitational force inside a uniform sphere vanishes. Of course, we rely on physical 
common sense here (for example, the existence of a unique free distribution of charge 
is a mathematically complicated matter!). 

30.4 Homeoids. A homeoid is a domain between two homothetic ellipsoids 
with a common center. 

Theorem 30.1. The gravitational force inside an infinitely thin homeoid is 
zero. 

Proof. First, consider an infinitesimally thin spherical shell. We know from 
Section 30.1 that the attraction force at a test point P is zero, see Figure 30.2. 
Denote by v and V the volumes obtained by intersecting an infinitesimal cone with 
vertex P and the shell, and by r and R the distances from P to the sphere along 
the axis of the cone. Since the attraction force vanishes, $v/r^2 = V/R^2$. 

A homeoid is obtained from a spherical shell by an affine transformation, a 
stretching in three pairwise orthogonal directions with different coefficients, see 
Figure 30.2. Denote the respective volumes and distances by $v', V', r', R'$. An 
affine transformation preserves the ratio of volumes and the ratio of collinear 
segments: $V'/v' = V/v$, $R'/r' = R/r$. It follows that $v'/(r')^2 = V'/(R')^2$, that is, the 
attraction forces at point P' cancel each other. □ 

Thus the potential inside an infinitely thin (and hence, a finitely thin) homeoid 
is constant. Consider the distribution of charge on an ellipsoid whose density is 
proportional to the width of the infinitely thin homeoid. For this distribution, the 
potential inside and on the ellipsoid is constant. Hence this is the free distribution 
of charge. 

Figure 30.2. The attraction force inside a homeoid is zero 

¹ In other words, the field is divergence free. 

30.5 Arnold's theorem. Consider a smooth closed surface M given by a 
polynomial equation $f(x, y, z) = 0$ of degree n. For example, 

$ax^4 + by^4 + cz^4 = 1$ 

is the equation of a surface of degree 4. A point P is called interior with respect to 
the surface M if every line through P intersects M exactly n times (of course, the 
number of intersections cannot exceed n). Figure 30.3 depicts two curves of degree 
4; the interior points of the first one lie inside the inner-most oval, and the second 
curve has no interior points at all. 




Figure 30.3. Two curves of degree four: one has interior points 
and the other does not 



Consider the distribution of charge on surface M, whose density is proportional 
to the width of the infinitesimal shell between M and the surface 
$M_\varepsilon = \{f(x, y, z) = \varepsilon\}$. This is a generalization of the homeoid density discussed in Section 30.4. Let 
P be an interior point. The signs of the charge alternate: it is positive on the 
component of M closest to P, negative on the next component, positive on the 
next, etc. 

Theorem 30.2. The attraction force exerted by M at point P is equal to zero. 

Proof. As before, consider an infinitesimal cone with vertex at P and axis $\ell$. 
The intersection of the cone with the shell between M and $M_\varepsilon$ consists of n domains, 
and we shall prove that their attractions at point P cancel out. 

Consider one of these domains, and let Q be a point of it. Let h be the length of 
the (infinitesimal) segment of $\ell$ inside the domain, and set $r = PQ$, see Figure 30.4. 
The volume of the domain equals h times the area of the orthogonal section of the 
cone at point Q. The latter area is proportional to $r^2$. Therefore the domain exerts 
an attraction force at P proportional to $r^2 h/r^2 = h$. Thus we need to prove that 
the sum of signed lengths of the (infinitesimal) segments of the line $\ell$ between M 
and $M_\varepsilon$ equals zero. 

Figure 30.4. Computing the force exerted by domain Q at point P 

The latter is a one-dimensional statement: we can forget about the ambient 
space and restrict the polynomial f to the line $\ell$. What we get is a polynomial of 
one variable, f(x). 



Figure 30.5. Computing the width of the infinitesimal shell 



Let us express h in terms of f. Consider Figure 30.5: $f(q) = 0$, $f(q + h) = \varepsilon$. 
One has: $f(q + h) = f(q) + hf'(q)$ (we ignore terms of order $h^2$ and higher), and 
hence $h = \varepsilon/|f'(q)|$. We need however to sum up with correct signs. We claim that, 
the signs taken into account, what we need to prove is the identity: 

$$\frac{1}{f'(q_1)} + \frac{1}{f'(q_2)} + \dots + \frac{1}{f'(q_n)} = 0 \tag{30.1}$$ 

where the sum is taken over all roots of a polynomial f(x) of degree n. 
Figure 30.6. Sign bookkeeping 



Indeed, the signs of derivatives at consecutive roots alternate, see Figure 30.6, 
and so do the signs of the charges. Let $q_1 < \dots < q_k$ be the roots to the 
left of point P, and $q_{k+1} < \dots < q_n$ to the right of it. To fix ideas, assume 
that $f'(q_{k+1}) > 0$. Then the total (positive) attraction force exerted by points 
$q_{k+1}, \dots, q_n$ is $1/f'(q_{k+1}) + \dots + 1/f'(q_n)$. Likewise, $f'(q_k) < 0$. The total 
(negative) attraction force exerted by points $q_1, \dots, q_k$ is $1/f'(q_1) + \dots + 1/f'(q_k)$. Summing 
up the two yields (30.1). 

Let us prove identity (30.1). Recall that all roots of f are real: 
$f(x) = (x - q_1) \cdots (x - q_n)$. It follows that 

$$f'(x) = (x - q_2) \cdots (x - q_n) + (x - q_1)(x - q_3) \cdots (x - q_n) + \dots + (x - q_1)(x - q_2) \cdots (x - q_{n-1}),$$ 

and hence $f'(q_i) = (q_i - q_1)(q_i - q_2) \cdots (q_i - q_n)$ (of course, the term $q_i - q_i$ is 
omitted). Then (30.1) is equivalent to the identity 

$$\frac{1}{(q_1 - q_2)(q_1 - q_3) \cdots (q_1 - q_n)} + \frac{1}{(q_2 - q_1)(q_2 - q_3) \cdots (q_2 - q_n)} + \dots + \frac{1}{(q_n - q_1) \cdots (q_n - q_{n-1})} = 0. \tag{30.2}$$ 

It remains to prove (30.2). Consider the polynomial of degree at most n − 1: 

$$g(x) = \frac{(x - q_2)(x - q_3) \cdots (x - q_n)}{(q_1 - q_2)(q_1 - q_3) \cdots (q_1 - q_n)} + \frac{(x - q_1)(x - q_3) \cdots (x - q_n)}{(q_2 - q_1)(q_2 - q_3) \cdots (q_2 - q_n)} + \dots + \frac{(x - q_1)(x - q_2) \cdots (x - q_{n-1})}{(q_n - q_1) \cdots (q_n - q_{n-1})}.$$ 

One has: $g(q_1) = g(q_2) = \dots = g(q_n) = 1$. If a polynomial of degree n − 1 has n 
roots then this polynomial is identically zero. Applying this to $g(x) - 1$, we get $g(x) \equiv 1$. In particular, 
the coefficient of $x^{n-1}$ in g(x) is equal to zero, and this is precisely identity (30.2). □ 
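
Identity (30.1) is easy to test in exact arithmetic. The following sketch assumes, as in the proof, that f is the monic polynomial with the given distinct roots; the function names are ours and purely illustrative.

```python
from fractions import Fraction

def derivative_at_root(roots, i):
    """f'(q_i) for f(x) = (x - q_1)...(x - q_n): the product of q_i - q_j, j != i."""
    value = Fraction(1)
    for j, q in enumerate(roots):
        if j != i:
            value *= Fraction(roots[i]) - Fraction(q)
    return value

def reciprocal_derivative_sum(roots):
    """Left-hand side of (30.1) for distinct roots (n >= 2)."""
    return sum(1 / derivative_at_root(roots, i) for i in range(len(roots)))

if __name__ == "__main__":
    print(reciprocal_derivative_sum([-3, 0, 1, 4, 7]))
```

Since the computation uses rational numbers, the sum is exactly zero for any choice of at least two distinct rational roots, not merely small.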



30.6 Attraction outside a homeoid: Ivory's theorem. We proved that 
the attraction force inside a homogeneous homeoid equals zero. What about its 
exterior? The answer was found by James Ivory early in the 19th century. 

An ellipsoid $M_0$ 

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$ 

is included in a 1-parameter family of quadratic surfaces $M_\lambda$ 

$$\frac{x^2}{a^2 + \lambda} + \frac{y^2}{b^2 + \lambda} + \frac{z^2}{c^2 + \lambda} = 1$$ 

called confocal quadrics, see Lecture 28. Depending on the value of the parameter 
$\lambda$, this can be an ellipsoid, a hyperboloid of one sheet or a hyperboloid of two sheets. 

Theorem 30.3. The equipotential surfaces of the free distribution of charge 
on an ellipsoid are the confocal ellipsoids. 

Proof. The proof is based on a lemma due to Ivory. Consider two confocal 
ellipsoids, $M_0$ and $M_\lambda$. The latter is obtained from the former by an affine 
transformation stretching along the three coordinate axes: 

$$A : (x, y, z) \mapsto (X, Y, Z) = \left(\frac{a'}{a}x, \frac{b'}{b}y, \frac{c'}{c}z\right) \tag{30.3}$$ 

where 

$$a' = \sqrt{a^2 + \lambda}, \quad b' = \sqrt{b^2 + \lambda}, \quad c' = \sqrt{c^2 + \lambda}.$$ 

We shall refer to (x, y, z) and (X, Y, Z) as the corresponding points. 

Lemma 30.1. Let P, Q be two points on an ellipsoid $M_0$ and P', Q' the 
corresponding points on $M_\lambda$. Then $|PQ'| = |P'Q|$ (see Figure 30.7). 




Figure 30.7. Ivory's lemma 

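Proof of Lemma. The proof is computational. Let $P = (x, y, z)$, $Q = (u, v, w)$ 
and $P' = (X, Y, Z)$, $Q' = (U, V, W)$. Then 

$$|PQ'|^2 = |P|^2 + |Q'|^2 - 2P \cdot Q', \qquad |P'Q|^2 = |P'|^2 + |Q|^2 - 2P' \cdot Q.$$ 

Note that $P \cdot Q' = P' \cdot Q$, as readily follows from (30.3). What remains to prove is 

$$|P'|^2 - |P|^2 = |Q'|^2 - |Q|^2.$$ 

Indeed, the left-hand side equals 

$$\left(\frac{a'^2}{a^2} - 1\right)x^2 + \left(\frac{b'^2}{b^2} - 1\right)y^2 + \left(\frac{c'^2}{c^2} - 1\right)z^2 = \lambda\left(\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2}\right) = \lambda,$$ 

and likewise for the right-hand side. □ 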
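
Ivory's lemma can also be spot-checked numerically. In the sketch below the points of the ellipsoid are parametrized by spherical angles, which is a choice made only for this illustration, and the helper names are ours. The corresponding points are obtained by the stretching (30.3).

```python
import math

def ivory_distances(a, b, c, lam, p_angles, q_angles):
    """Return (|PQ'|, |P'Q|) for two points P, Q on the ellipsoid with
    semi-axes a, b, c and their images P', Q' under the map (30.3)."""
    # Stretching coefficients a'/a, b'/b, c'/c.
    scale = tuple(math.sqrt(s * s + lam) / s for s in (a, b, c))

    def on_ellipsoid(theta, phi):
        return (a * math.sin(theta) * math.cos(phi),
                b * math.sin(theta) * math.sin(phi),
                c * math.cos(theta))

    def corresponding(point):
        return tuple(k * t for k, t in zip(scale, point))

    P = on_ellipsoid(*p_angles)
    Q = on_ellipsoid(*q_angles)
    return math.dist(P, corresponding(Q)), math.dist(corresponding(P), Q)

if __name__ == "__main__":
    print(ivory_distances(3.0, 2.0, 1.0, 0.5, (0.7, 1.1), (2.0, 4.0)))
```

The two printed distances agree up to rounding error, for any admissible value of the parameter (one needs $c^2 + \lambda > 0$ for the smallest semi-axis c).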



Now we can complete the proof of Ivory's Theorem. Consider two infinitesimally 
thin confocal homeoids of equal volumes, $M_0$ and $M_\lambda$. Let P and P' be a 
pair of corresponding points on them. Then the potential exerted by $M_0$ at point 
P' equals the potential exerted by $M_\lambda$ at point P. 

Indeed, let Q and Q' be a pair of corresponding points. Consider an infinitesimal 
volume at Q and an equal volume at Q'. By Lemma 30.1, the contribution of 
the former volume to the potential at point P' equals the contribution of the latter 
volume to the potential at point P. Since this holds for all pairs of corresponding 
points Q, Q', the statement follows. 

Thus the potential exerted by $M_0$ at any point P' of $M_\lambda$ equals the potential 
exerted by $M_\lambda$ at the corresponding point P of $M_0$. But the potential inside a 
uniform homeoid $M_\lambda$ is constant. Therefore the potential exerted by $M_0$ at every 
point of $M_\lambda$ is the same. □ 

Let us mention, in conclusion, that Newton's and Ivory's theorems on gravita- 
tional attractions of quadratic surfaces have elegant magnetic analogs, discovered 
by V. Arnold, see [4]. Consider a conducting hyperboloid of one sheet and as- 
sume that there is a voltage drop between its ends at infinity. This voltage drop 
induces an electric current along the meridians of the hyperboloid. The claim is 
that the magnetic field of this current inside the hyperboloid is zero, and outside 
the hyperboloid the magnetic field is directed along its parallels. 



John Smith 
January 23, 2010 



Martyn Green 
August 2, 1936 



Henry Williams 
June 6, 1944 



30.7 Exercises. 



30.1. Let $x_1, \dots, x_{n+1}$ and $a_1, \dots, a_{n+1}$ be two sets of distinct real numbers. 
Construct a polynomial of degree not greater than n that takes value $a_i$ at point 
$x_i$. How many such polynomials are there? 

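30.2. Let f(x) be a polynomial of degree n with leading coefficient 1 and with n distinct real roots $q_1, \dots, q_n$. 
Prove that 

$$\frac{q_1^k}{f'(q_1)} + \dots + \frac{q_n^k}{f'(q_n)} = 0$$ 

for $k = 0, 1, \dots, n - 2$, and 

$$\frac{q_1^{n-1}}{f'(q_1)} + \dots + \frac{q_n^{n-1}}{f'(q_n)} = 1.$$ 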



30.3. Prove that the conclusion of Theorem 30.2 still holds if the density of the 
charge is multiplied by an arbitrary polynomial $\phi(x, y, z)$ of degree n − 2 or less. 

Comment: a generalization, due to A. Givental [36], states that if the density 
of the charge is multiplied by a polynomial of degree m then the potential at the 
interior points is given by a polynomial of degree not greater than $m + 2 - n$. 
30.4. A harmonic function f(x, y) is a function satisfying the equality 

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0$$ 

(for example, $x^2 - y^2$ or $\ln(x^2 + y^2)$). 

(a) Prove that a harmonic function has neither local minima nor maxima. 

(b) Prove that a level curve of a smooth harmonic function cannot be a simple 
closed curve. 

(c) Prove that the following polynomials of degree n are harmonic: 

$$P_n(x, y) = \sum_{k \equiv n \ (\mathrm{mod}\ 2)} (-1)^{\frac{n-k}{2}} \binom{n}{k} x^k y^{n-k}$$ 

and 

$$Q_n(x, y) = \sum_{k + 1 \equiv n \ (\mathrm{mod}\ 2)} (-1)^{\frac{n-k-1}{2}} \binom{n}{k} x^k y^{n-k}.$$ 

These polynomials are the real and the imaginary parts of $(x + iy)^n$. 

(d) Prove that every homogeneous harmonic polynomial of degree n has the 
form $aP_n(x, y) + bQ_n(x, y)$ where a and b are real numbers. 

30.5. Prove that any finite configuration of positive and negative charges cannot 
be stable. 

Hint. The Coulomb potential is a harmonic function. 

30.6. A polynomial equation p(x, y) = 0 determines an algebraic curve; an oval 
of the algebraic curve is a simple closed smooth curve satisfying this equation (see 
Lecture 10 for ovals of cubic curves). 

(a) We saw in Lecture 17 that a curve of degree 4 may have four ovals. Prove 
that it cannot have five ovals. 

Hint. A curve of degree 4 has at most 8 intersections with a generic conic. 

(b) Show that at most two ovals of a curve of degree 4 or 5 are nested, and 
that if there are two nested ovals then there are no other ovals. 

Hint. Consider the intersection with a line. 

(c) What is the greatest number of pairwise nested ovals that a curve of degree 
n may have? 

Comment. The greatest number of components of an algebraic curve of degree 
n in the projective plane is $(n^2 - 3n + 4)/2$ (Harnack's theorem). Hilbert's Sixteenth 
problem asks to classify possible mutual positions of the ovals of algebraic curves. 

30.7. Given two confocal ellipses, let A be an affine transformation as in (30.3) 
that takes the first to the second. Prove that any point P and its image A(P) lie 
on a hyperbola, confocal with the ellipses. 

30.8. Given a convex polyhedron, consider the vectors orthogonal to its faces 
whose magnitudes are equal to the areas of the faces. Prove that the sum of these 
vectors is zero. 

Hint. Under orthogonal projection from one plane to another, the area is 
multiplied by the cosine of the angle between the planes. 

Comment. The claim is physically obvious: the vectors in question are the 
pressures exerted by the air inside the polyhedron on its faces. 
LECTURE 1 

1.4. If r = 0, then 

$$\alpha = [a_0; a_1, \dots, a_{d-1}, \alpha]$$ 

which can be rewritten as 

$$\alpha = \frac{A\alpha + B}{C\alpha + D}, \qquad C\alpha^2 + (D - A)\alpha - B = 0.$$ 

To consider the general case, remark that if $\gamma$ is a root of quadratic equation 
$Kx^2 + Lx + M = 0$, then $a + \frac{1}{\gamma}$ is a root of a quadratic equation 
$Mx^2 + (L - 2aM)x + (Ma^2 - La + K) = 0$. If $\alpha$ is a periodic continued fraction with $r \ne 0$, 
then 

$$\alpha = [a_0; a_1, \dots, a_{r-1}, \beta]$$ 

where $\beta$ is a periodic continued fraction with r = 0. Hence, $\beta$ is a root of a 
quadratic equation with integer coefficients, and 
$\beta_{r-1} = a_{r-1} + \frac{1}{\beta}$, $\beta_{r-2} = a_{r-2} + \frac{1}{\beta_{r-1}}$, ..., $\beta_1 = a_1 + \frac{1}{\beta_2}$, $\alpha = a_0 + \frac{1}{\beta_1}$ 
are all roots of quadratic equations with integer coefficients. 

1.5. See [56], Chapter 4. 
1.6. 

$\sqrt{3} = [1; 1, 2, 1, 2, 1, 2, \dots]$ 
$\sqrt{5} = [2; 4, 4, 4, 4, \dots]$ 
$\sqrt{n^2 + 1} = [n; 2n, 2n, 2n, 2n, \dots]$ 
$\sqrt{n^2 - 1} = [n - 1; 1, 2(n - 1), 1, 2(n - 1), 1, 2(n - 1), 1, \dots]$ 
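
These expansions can be reproduced by the standard integer recurrence for the continued fraction of a square root. The sketch below is an illustration, not part of the solution; the function name is ours, and N is assumed not to be a perfect square.

```python
import math

def sqrt_continued_fraction(N, terms):
    """First `terms` incomplete fractions of sqrt(N), N not a perfect square.

    Each complete quotient has the form (sqrt(N) + m) / d with integers m, d,
    so no floating point arithmetic is needed."""
    a0 = math.isqrt(N)
    if a0 * a0 == N:
        raise ValueError("N must not be a perfect square")
    result = [a0]
    m, d, a = 0, 1, a0
    while len(result) < terms:
        m = d * a - m
        d = (N - m * m) // d      # the division is always exact
        a = (a0 + m) // d
        result.append(a)
    return result

if __name__ == "__main__":
    for N in (3, 5, 10, 15):
        print(N, sqrt_continued_fraction(N, 7))
```

For N = 10 and N = 15 the output illustrates the last two formulas with n = 3 and n = 4 respectively.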
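1.9. First notice that for any real $\gamma \ne 0$ and any integer a, the numbers $\gamma$ and 
$a + \frac{1}{\gamma}$ are related: 

$$a + \frac{1}{\gamma} = \frac{a\gamma + 1}{1 \cdot \gamma + 0}, \qquad a \cdot 0 - 1 \cdot 1 = -1.$$ 

If the continued fractions $\alpha$ and $\beta$ are almost identical, then 

$$\alpha = [a_0; a_1, \dots, a_{m-1}, \gamma], \qquad \beta = [b_0; b_1, \dots, b_{n-1}, \gamma].$$ 

Hence $\gamma$ is related to 
$\alpha_{m-1} = a_{m-1} + \frac{1}{\gamma}$, $\alpha_{m-2} = a_{m-2} + \frac{1}{\alpha_{m-1}}$, ..., $\alpha_1 = a_1 + \frac{1}{\alpha_2}$, $\alpha = a_0 + \frac{1}{\alpha_1}$ 
and also to 
$\beta_{n-1} = b_{n-1} + \frac{1}{\gamma}$, $\beta_{n-2} = b_{n-2} + \frac{1}{\beta_{n-1}}$, ..., $\beta_1 = b_1 + \frac{1}{\beta_2}$, $\beta = b_0 + \frac{1}{\beta_1}$. 
Hence, $\alpha$ and $\beta$ are related. (We constantly use implicitly Exercise 1.8.) 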
1.10. Lemma. If $\alpha$ and $\beta$ are related, then $\beta$ can be obtained from $\alpha$ by a 
sequence of operations $\gamma \mapsto -\gamma$, $\gamma \mapsto \gamma + 1$, $\gamma \mapsto \frac{1}{\gamma}$. 

Proof of Lemma. Let $\alpha = \frac{a\beta + b}{c\beta + d}$, $ad - bc = \pm 1$. Changing, if necessary, 
the sign of $\alpha$, we can assume that a > 0, c > 0. Then we follow the Euclidean 
algorithm for (a, c): if $a = pc + q$, $0 \le q < c$, we change $\alpha$ into $\alpha - p$ (that is, 
$\alpha \mapsto -\alpha \mapsto -\alpha + 1 \mapsto \dots \mapsto -\alpha + p \mapsto \alpha - p$), and get $\frac{q\beta + r}{c\beta + d}$ where $r = b - pd$; 
obviously, $qd - rc = ad - bc = \pm 1$. We turn this into $\frac{c\beta + d}{q\beta + r}$, and so on. We 
arrive at $\frac{k\beta + \ell}{0 \cdot \beta + m}$ with $km - 0 \cdot \ell = \pm 1$ which implies $k = \pm 1$, $m = \pm 1$. Thus, $\alpha$ 
is reduced by our operations to $\pm\beta \pm \ell$, which, obviously, can be reduced by our 
operations to $\beta$. □ 

To finish the proof of the statement of Exercise 1.10, we need only to observe 
that the continued fraction for each of $-\alpha$, $\alpha + 1$, $\frac{1}{\alpha}$ is almost equal to that of $\alpha$ 
(see Exercise 1.2 for $-\alpha$). 

1.11. Let $\alpha = [a_0; a_1, a_2, \dots]$ be not related to the golden ratio. This means 
that there are infinitely many $a_n \ge 2$. 

Case 1: infinitely many $a_n \ge 3$. Then the indicators of quality of corresponding 
convergents are $a_n + \dots > 3 > \sqrt{8}$. 

Let $a_n \le 2$ for n large. 

Case 2: infinitely many $a_n = 1$. Then there are infinitely many n with $a_n = 2$, 
$a_{n+1} = 1$. The indicators of quality of the corresponding convergents are 

$$2 + \cfrac{1}{1 + \ddots} + \cfrac{1}{a_{n-1} + \ddots} > 2 + \frac{1}{2} + \frac{1}{3} = \frac{17}{6} > \sqrt{8}.$$ 

There remains 

Case 3: $a_n = 2$ for n large. Then the limit of the indicators of quality of the 
convergents is 

$$[2; 2, 2, 2, \dots] + [0; 2, 2, 2, \dots] = 1 + \sqrt{2} + \frac{1}{1 + \sqrt{2}} = \sqrt{2} + 1 + (\sqrt{2} - 1) = 2\sqrt{2} = \sqrt{8}.$$ 

1.12. We use the notation of the previous solution. 

Case 1: infinitely many $a_n \ge 3$. The indicators of quality of the corresponding 
convergents are $> 3 > \sqrt{\frac{221}{25}}$. 

Since $\alpha$ is not related to the golden ratio and to $\sqrt{2}$, we can assume that, for n 
large, $a_n \le 2$ and there are infinitely many fragments {1, 2} in the sequence $\{a_n\}$. 

Case 2: there are infinitely many fragments {1, 2, 1} in the sequence $\{a_n\}$. The 
indicators of quality of the corresponding convergents will be greater than 

$$2 + \cfrac{1}{1 + \ddots} + \cfrac{1}{1 + \ddots} > 3 > \sqrt{\frac{221}{25}}.$$ 



Let there be finitely many fragments {1, 2, 1}. 

Case 3: there are infinitely many fragments {2, 1, 2}. Then there are infinitely 
many fragments {2, 2, 1, 2, 2}. Hence, infinitely many convergents have indicators 
of quality 

$$2 + \cfrac{1}{2 + \ddots} + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2 + \ddots}}} > 2 + \frac{1}{3} + \cfrac{1}{1 + \cfrac{1}{2 + \frac{1}{3}}} = 2 + \frac{1}{3} + \frac{7}{10} > 3 > \sqrt{\frac{221}{25}}.$$ 

Let there be also finitely many fragments {2, 1,2}. 

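Case 4: there are infinitely many fragments {2, 2, 2}. Then there are infinitely 
many fragments {a, 1, 1, 2, 2, 2, 2, b} with $a \le 2$, $b \le 2$. Hence, there are infinitely 
many convergents with the indicators of quality of the form 

$$2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{a + \ddots}}} + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{b + \ddots}}}} > 2 + \cfrac{1}{1 + \cfrac{1}{1 + \frac{1}{3}}} + \cfrac{1}{2 + \cfrac{1}{2 + \frac{1}{3}}} = 2 + \frac{4}{7} + \frac{7}{17} > \sqrt{\frac{221}{25}}.$$ 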



Let there be also finitely many fragments {2, 2, 2}. 

Case 5: there are infinitely many fragments {1, 1, 1}. Then there are infinitely 
many fragments {1, 1, 1, 2, 2, 1, 1} and hence infinitely many convergents with the 
indicators of quality greater than 

$$2 + \cfrac{1}{1 + \cfrac{1}{1 + \frac{1}{2}}} + \cfrac{1}{2 + \cfrac{1}{1 + \frac{1}{2}}} = 2 + \frac{3}{5} + \frac{3}{8} > \sqrt{\frac{221}{25}}.$$ 

In this case, $\alpha$ is related to $[2; 2, 1, 1, 2, 2, 1, 1, \dots] = \frac{9 + \sqrt{221}}{10}$. 

There remains 

Case 6: the continued fraction is periodic with the period {1, 1, 2, 2}. In this 
case, the indicators of quality of the convergents corresponding to the incomplete 
fractions 1 are too low to consider, the indicators of quality of the convergents 
corresponding to the incomplete fractions 2 have the limit equal to 

$$[2; 2, 1, 1, 2, 2, 1, 1, \dots] + [0; 1, 1, 2, 2, 1, 1, 2, 2, \dots] = \frac{9 + \sqrt{221}}{10} + \frac{-9 + \sqrt{221}}{10} = \sqrt{\frac{221}{25}}.$$ 

1.13. For the numbers provided in the hint, the limit of indicators of quality of 
the convergents corresponding to the incomplete fractions 2 is 

$$2 + [0; 1, 1, 1, 1, \dots] + [0; 2, 1, 1, 1, \dots] = 2 + \frac{\sqrt{5} - 1}{2} + \frac{3 - \sqrt{5}}{2} = 3.$$ 

1.15. The largest number with all incomplete fractions $\le n$ is, obviously, 

$$[n; 1, n, 1, n, 1, n, \dots] = \frac{n + \sqrt{n^2 + 4n}}{2}.$$ 

Hence, the maximal possible limit of indicators of quality of an infinite sequence 
of convergents is $[n; 1, n, 1, n, \dots] + [0; 1, n, 1, n, 1, n, \dots] = \sqrt{n^2 + 4n}$. Thus, 
$\lambda_n = \sqrt{n^2 + 4n}$. 
2.3. Let $m_{99}, m_{98}, \dots, m_1, m_0$ and $n_{99}, n_{98}, \dots, n_1, n_0$ be the digits of m and 
n in the numerical system with the base 2. According to the Kummer Theorem, 
$\binom{m+n}{n}$ is even if and only if there is at least one carry-over in the addition m + n, 
that is, if at least one of the pairs $(m_i, n_i)$ is (1, 1). Thus, there are 3 possibilities 
for each pair $(m_i, n_i)$, which shows that the total amount of pairs (m, n) with 
$0 \le m < 2^{100}$, $0 \le n < 2^{100}$ with $\binom{m+n}{n}$ odd is $3^{100}$, which is 

$$\frac{3^{100}}{4^{100}} \approx 3.21 \cdot 10^{-13}$$ 

of all numbers $\binom{m+n}{n}$. 
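
The count can be confirmed by brute force for a small number of binary digits. The sketch below is only a check under the assumption that the 100 digits are replaced by a small number `bits`; the function name is ours.

```python
from math import comb

def count_odd_binomials(bits):
    """Number of pairs (m, n), 0 <= m, n < 2**bits, with C(m + n, n) odd."""
    limit = 1 << bits
    return sum(1
               for m in range(limit)
               for n in range(limit)
               if comb(m + n, n) % 2 == 1)

if __name__ == "__main__":
    for bits in range(1, 6):
        print(bits, count_odd_binomials(bits), 3 ** bits)
    print(0.75 ** 100)  # the proportion for 100 binary digits
```

For each small value of `bits` the two printed counts coincide, in agreement with the argument that every pair of digits has exactly 3 admissible values.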



2.4. We use the notations $m_i, n_i$ from the previous solution. According to the 
Kummer Theorem, $\binom{m+n}{n}$ is not divisible by 4 if and only if there is at most 
one carry-over in the addition m + n. In other words, there should be either no 
pairs (1, 1) at all, and we already know that there are $3^{100}$ such pairs (m, n), or, 
the only such pair is $(m_{99}, n_{99})$, or, for some $k \le 98$, one must have $(m_{k+1}, n_{k+1}) = (0, 0)$, 
$(m_k, n_k) = (1, 1)$ with all other $(m_i, n_i)$ different from (1, 1). Thus the total 
amount of numbers $\binom{m+n}{n}$ not divisible by 4 is 

$$3^{100} + 3^{99} + 99 \cdot 3^{98} = 111 \cdot 3^{98} = 37 \cdot 3^{99},$$ 

which is 

$$\frac{37 \cdot 3^{99}}{4^{100}} \approx 3.96 \cdot 10^{-12}$$ 

of all numbers $\binom{m+n}{n}$. 

2.6. (a) 

$$\binom{2n}{2} - n = \frac{2n(2n - 1)}{2} - n = 2n(n - 1)$$ 

which is divisible by 8 if and only if $\frac{n(n-1)}{2}$ is even. 

(b) 

$$\binom{2n}{4} - \binom{n}{2} = \frac{2n(2n - 1)(2n - 2)(2n - 3)}{24} - \frac{n(n - 1)}{2} = \frac{2n^2(n - 1)(n - 2)}{3}.$$ 

This is not divisible by 8 if and only if n is odd and n − 1 is not divisible by 4, that 
is, $n \equiv 3 \bmod 4$. 



2.7. (a) 

$$\binom{3n}{3} - n = \frac{3n(3n - 1)(3n - 2)}{6} - n = \frac{n}{2}\left[(3n - 1)(3n - 2) - 2\right] = \frac{9n^2(n - 1)}{2}.$$ 

This is not divisible by 27 if and only if neither n, nor n − 1 is divisible by 3, that 
is, $n \equiv 2 \bmod 3$. 
(b) 

$$\binom{3n}{6} - \binom{n}{2} = \frac{3n(3n - 1)(3n - 2)(3n - 3)(3n - 4)(3n - 5)}{720} - \frac{n(n - 1)}{2}$$ 

$$= \frac{n(n - 1)}{80}\left[(3n - 1)(3n - 2)(3n - 4)(3n - 5) - 40\right] = \frac{9n^2(n - 1)(n - 2)(9n^2 - 18n + 13)}{80}$$ 

which is always divisible by 27. 
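
The divisibility claims of Solutions 2.6 and 2.7 can be spot-checked by machine. The sketch below assumes that the quantities in question are the differences of binomial coefficients whose simplified forms appear above, namely C(2n, 2) − n, C(2n, 4) − C(n, 2), C(3n, 3) − n and C(3n, 6) − C(n, 2); the function name is ours.

```python
from math import comb

def check_divisibility_claims(limit):
    """Check the claims of Solutions 2.6 and 2.7 for 1 <= n < limit."""
    for n in range(1, limit):
        a = comb(2 * n, 2) - n
        b = comb(2 * n, 4) - comb(n, 2)
        c = comb(3 * n, 3) - n
        d = comb(3 * n, 6) - comb(n, 2)
        # 2.6 (a): divisible by 8 exactly when n(n - 1)/2 is even.
        if (a % 8 == 0) != ((n * (n - 1) // 2) % 2 == 0):
            return False
        # 2.6 (b): not divisible by 8 exactly when n = 3 mod 4.
        if (b % 8 != 0) != (n % 4 == 3):
            return False
        # 2.7 (a): not divisible by 27 exactly when n = 2 mod 3.
        if (c % 27 != 0) != (n % 3 == 2):
            return False
        # 2.7 (b): always divisible by 27.
        if d % 27 != 0:
            return False
    return True

if __name__ == "__main__":
    print(check_divisibility_claims(500))
```

A finite check of this kind is of course not a proof; it only guards against slips in the algebra.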

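2.8. (a) This follows from the Newton formula 

$$(1 + y)^r = 1 + ry + \frac{r(r - 1)}{1 \cdot 2} y^2 + \frac{r(r - 1)(r - 2)}{1 \cdot 2 \cdot 3} y^3 + \dots$$ 

(which holds for any real r and any y with |y| < 1). Put $y = -4x$, $r = -\frac{1}{2}$. We get 

$$\frac{1}{\sqrt{1 - 4x}} = 1 - \frac{1}{2}(-4x) + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)}{1 \cdot 2}(-4x)^2 + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)\left(-\frac{5}{2}\right)}{1 \cdot 2 \cdot 3}(-4x)^3 + \dots$$ 

The term with $(-4x)^n$ is 

$$\frac{1}{n!}\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right) \cdots \left(-\frac{2n - 1}{2}\right)(-4x)^n = \frac{1 \cdot 3 \cdots (2n - 1)}{2^n n!}\, 4^n x^n = \frac{(2n)!}{2 \cdot 4 \cdots 2n \cdot 2^n n!}\, 4^n x^n = \frac{(2n)!}{2^n n! \cdot 2^n n!}\, 4^n x^n = \binom{2n}{n} x^n.$$ 

Thus, 

$$\sum_{n=0}^{\infty} \binom{2n}{n} x^n = \frac{1}{\sqrt{1 - 4x}}.$$ 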
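(b) It follows from Part (a) that 

$$\left(\sum_{p=0}^{\infty} \binom{2p}{p} x^p\right)\left(\sum_{q=0}^{\infty} \binom{2q}{q} x^q\right) = \frac{1}{1 - 4x} = 1 + 4x + (4x)^2 + (4x)^3 + \dots$$ 

Equate the terms with $x^n$: 

$$\sum_{p+q=n} \binom{2p}{p}\binom{2q}{q} x^n = 4^n x^n, \quad \text{that is,} \quad \sum_{p+q=n} \binom{2p}{p}\binom{2q}{q} = 4^n.$$ 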
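
The resulting identity is easy to test for small n. The following sketch (the function name is ours) computes the left-hand side directly.

```python
from math import comb

def central_binomial_convolution(n):
    """Sum of C(2p, p) * C(2q, q) over all p + q = n with p, q >= 0."""
    return sum(comb(2 * p, p) * comb(2 * (n - p), n - p) for p in range(n + 1))

if __name__ == "__main__":
    for n in range(8):
        print(n, central_binomial_convolution(n), 4 ** n)
```

The two printed columns coincide, as the generating function argument predicts.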

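2.9. (a) One of many possible proofs: 

$$C_n = \frac{(2n)!}{n!(n + 1)!} = \frac{1}{n(n + 1)} \cdot \frac{(2n)!}{(n - 1)!\,n!} = \left(\frac{1}{n} - \frac{1}{n + 1}\right)\frac{(2n)!}{(n - 1)!\,n!}$$ 

$$= \frac{(2n)!}{n!\,n!} - \frac{(2n)!}{(n - 1)!(n + 1)!} = \binom{2n}{n} - \binom{2n}{n + 1}.$$ 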
(c) By (b), $(xC(x))' = \sum \binom{2n}{n} x^n = (1 - 4x)^{-\frac{1}{2}}$. Hence, 

$$xC(x) = \int (1 - 4x)^{-\frac{1}{2}}\,dx = -\frac{1}{4} \cdot \frac{(1 - 4x)^{\frac{1}{2}}}{\frac{1}{2}} + C = -\frac{\sqrt{1 - 4x}}{2} + C,$$ 

and since $0 \cdot C(0) = 0$, $C = \frac{1}{2}$. Thus 

$$xC(x) = \frac{1}{2} - \frac{\sqrt{1 - 4x}}{2}, \quad \text{and} \quad C(x) = \frac{1 - \sqrt{1 - 4x}}{2x}.$$ 

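(d) From the formula proved in Part (c), $C(x) = \frac{1 - \sqrt{1 - 4x}}{2x}$, we deduce that 
$xC(x)^2 - C(x) + 1 = 0$. Since $C(x)^2 = \sum_{n=0}^{\infty}\left(\sum_{p+q=n} C_p C_q\right) x^n$, this implies 

$$\sum_{n=0}^{\infty}\left(\sum_{p+q=n} C_p C_q\right) x^{n+1} - \sum_{n=0}^{\infty} C_n x^n + 1 = 0,$$ 

that is, 

$$0 = \sum_{n=1}^{\infty}\left(\sum_{p+q=n-1} C_p C_q\right) x^n - \sum_{n=1}^{\infty} C_n x^n = \sum_{n=1}^{\infty}\left(\left(\sum_{p+q=n-1} C_p C_q\right) - C_n\right) x^n.$$ 

Thus, 

$$C_n = \sum_{p+q=n-1} C_p C_q \quad \text{for } n \ge 1.$$ 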

(e) Apply induction on n. If n = 0 then our expression is just $a_1$ which 
does not involve any multiplications and has only $1 = C_0$ meaning. Assume that 
our statement is true for all products $a_1 * \dots * a_{k+1}$ with k < n. Let A be one 
of the meanings of the product $a_1 * \dots * a_{n+1}$. Break A at the multiplication 
performed last: A = B * C where B and C are obtained from $a_1 * \dots * a_k$ and 
$a_{k+1} * \dots * a_{n+1}$ ($1 \le k \le n$) by specifying the order of multiplications. We see that 
the number of meanings for $a_1 * \dots * a_{n+1}$ is 

$$\sum_{k=1}^{n} C_{k-1} C_{n-k} = \sum_{p+q=n-1} C_p C_q = C_n.$$ 

(f) Apply induction on n. If n = 3 then the number of triangulations is $1 = C_1$. 
Assume that our statement is true for all convex k-gons with k < n. Let 
$P = A_1 A_2 \dots A_n$ and consider a triangulation of P. Let $A_1 A_2 A_k$ ($3 \le k \le n$) 
be the triangle of the triangulation that contains $A_1 A_2$. Then our triangulation is 
determined by triangulations of the (k − 1)-gon $A_2 A_3 \dots A_k$ (which is nothing, if 
k = 3) and the (n − k + 2)-gon $A_1 A_k A_{k+1} \dots A_n$. Thus, the number of triangulations 
of P is 

$$C_{n-3} + \sum_{k=4}^{n-1} C_{k-3} C_{n-k} + C_{n-3} = \sum_{p+q=n-3} C_p C_q = C_{n-2}.$$ 
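
The recurrence of Part (d) and the count of Part (e) can be compared with the closed formula by a short computation. This is a sketch for illustration only; the function names are ours, and the products are enumerated as strings with explicit parentheses.

```python
from functools import lru_cache
from math import comb

def catalan_closed(n):
    """C_n = (2n)! / (n! (n + 1)!)."""
    return comb(2 * n, n) // (n + 1)

@lru_cache(maxsize=None)
def catalan_recursive(n):
    """C_n from the recurrence C_n = sum of C_p C_q over p + q = n - 1."""
    if n == 0:
        return 1
    return sum(catalan_recursive(p) * catalan_recursive(n - 1 - p)
               for p in range(n))

def meanings(factors):
    """All ways to specify the order of multiplications in a product."""
    if len(factors) == 1:
        return [factors[0]]
    result = []
    # Break the product at the multiplication performed last.
    for k in range(1, len(factors)):
        for left in meanings(factors[:k]):
            for right in meanings(factors[k:]):
                result.append("(" + left + "*" + right + ")")
    return result

if __name__ == "__main__":
    print([catalan_recursive(n) for n in range(8)])
    print(meanings(["a", "b", "c", "d"]))
```

A product of n + 1 factors has $C_n$ meanings, so the four factors above give $C_3 = 5$ parenthesized expressions.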
3.1. Let $n = n_1 + \dots + n_k$ be a partition from the left box, and let $n_i = m_i \cdot 2^{d_i}$ 
where $m_i$ is odd. Replace $n_i$ by $m_i + \dots + m_i$ ($2^{d_i}$ summands) and then reorder the summands in 
a non-decreasing order. We get a partition of n from the right box. 

Let $n = n_1 + \dots + n_k$ be a partition from the right box. For an odd number m, let 
$r_m$ be the number of m's in the partition. Let $r_m = 2^{d_{m,1}} + \dots + 2^{d_{m,s}}$, $d_{m,1} > \dots > d_{m,s} \ge 0$ 
(this is equivalent to the presentation of $r_m$ in the numerical system with 
the base 2). For every m, replace the group $m + \dots + m$ by $m \cdot 2^{d_{m,1}} + \dots + m \cdot 2^{d_{m,s}}$ 
and then reorder the summands in the increasing order. We get a partition of n 
from the left box. 

These two transformations between partitions in the left box and partitions in 
the right box are inverse to each other. 
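
The two transformations are short enough to write out in code. The sketch below assumes, as the two maps suggest, that the left box holds partitions into distinct parts and the right box partitions into odd parts; the function names are ours.

```python
from collections import Counter

def distinct_to_odd(parts):
    """Replace each part m * 2**d (m odd) by 2**d copies of m."""
    result = []
    for part in parts:
        copies = 1
        while part % 2 == 0:
            part //= 2
            copies *= 2
        result.extend([part] * copies)
    return sorted(result)

def odd_to_distinct(parts):
    """Group equal odd parts m and split their number r_m in base 2."""
    result = []
    for m, r in Counter(parts).items():
        power = 1
        while r:
            if r & 1:
                result.append(m * power)
            r >>= 1
            power *= 2
    return sorted(result)

if __name__ == "__main__":
    odd = distinct_to_odd([1, 4, 6])
    print(odd, odd_to_distinct(odd))
```

For the partition 11 = 1 + 4 + 6 the first map produces 1 + 1 + 1 + 1 + 1 + 3 + 3, and the second map recovers the original partition.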



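3.2. 

$$(1 - p^{-s})^{-1} = 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \frac{1}{p^{3s}} + \dots$$ 

Hence, 

$$\prod_p (1 - p^{-s})^{-1} = \prod_p \left(1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \dots\right) = \sum_{k_2, k_3, k_5, \dots} \frac{1}{(2^{k_2} 3^{k_3} 5^{k_5} \cdots)^s} = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$ 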
3.3. 

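$$d(2^{k_2} 3^{k_3} 5^{k_5} \cdots) = \sum_{0 \le \ell_p \le k_p} \left(2^{\ell_2} 3^{\ell_3} 5^{\ell_5} \cdots\right) = \prod_{p \in \{\text{primes}\}} (1 + p + p^2 + \dots + p^{k_p}) = \prod_{p \in \{\text{primes}\}} \frac{p^{k_p + 1} - 1}{p - 1}.$$ 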

3.5. First, 



-(« + fcm + lm) = fc2 + ^ + m2 2 (fc + ^ + m)2 = fc2 + /2 + m2 - 1 > 0. 

Second, the residues of the squares modulo 8 are 0, 1, or 4, and $k^2 + l^2 + m^2 - 1$ 
cannot be 6 modulo 8. 

3.8. We restrict ourselves to Parts (a) - (d); we hope that the reader will be 
able to reconstruct the solutions of the remaining parts. 

Lemma 30.2. For every k, 

$$F_1 + F_2 + \dots + F_k = F_{k+2} - 2.$$ 

Proof. Induction. For k = 1, it is true (1 = 3 − 2). If $F_1 + F_2 + \dots + F_k = F_{k+2} - 2$, 
then $F_1 + F_2 + \dots + F_{k+1} = (F_1 + F_2 + \dots + F_k) + F_{k+1} = F_{k+2} - 2 + F_{k+1} = F_{k+3} - 2$. 

Lemma 30.3. For every k, 

$$F_1 + F_3 + \dots + F_{2k-1} = F_{2k} - 1,$$ 
$$F_2 + F_4 + \dots + F_{2k} = F_{2k+1} - 1.$$ 

Proof. Induction. For k = 1 it is true (1 = 2 − 1, 2 = 3 − 1). If 
$F_1 + F_3 + \dots + F_{2k-1} = F_{2k} - 1$, then $F_1 + F_3 + \dots + F_{2k+1} = (F_1 + F_3 + \dots + F_{2k-1}) + F_{2k+1} = F_{2k} - 1 + F_{2k+1} = F_{2k+2} - 1$. 
If $F_2 + F_4 + \dots + F_{2k} = F_{2k+1} - 1$, then 
$F_2 + F_4 + \dots + F_{2k+2} = (F_2 + F_4 + \dots + F_{2k}) + F_{2k+2} = F_{2k+1} - 1 + F_{2k+2} = F_{2k+3} - 1$. 

(a) Induction. For n = 1 it is true ($1 = F_1$). Let it be true for all numbers less than n, 
and let $F_k$ be the largest Fibonacci number $\le n$. If $F_k = n$, then $n = F_k$ is our partition; 
let $F_k < n$. According to the induction hypothesis, $n - F_k = F_{k_1} + \dots + F_{k_s}$, $k_1 < \dots < k_s$. 
But $n < F_{k+1}$ implies $n - F_k < F_{k+1} - F_k = F_{k-1}$; hence $k_s < k - 1$, and 
$n = F_{k_1} + \dots + F_{k_s} + F_k$ satisfies our requirements. 

(b) The existence of a partition $n = F_{k_1} + \dots + F_{k_s}$ with $k_i - k_{i-1} \ge 2$ is actually 
proved in Part (a): it is sufficient to include the condition $k_i - k_{i-1} \ge 2$ in the induction 
hypothesis. Let us prove the uniqueness. Let 

$$F_{k_1} + \dots + F_{k_s} = F_{\ell_1} + \dots + F_{\ell_t}$$ 

be two different partitions of the same number with $k_i - k_{i-1} \ge 2$ for $1 < i \le s$ and 
$\ell_j - \ell_{j-1} \ge 2$ for $1 < j \le t$. If $F_{k_s} = F_{\ell_t}$, we cancel them, and we keep on canceling until 
the largest parts of the partition become different. So, we can assume that $k_s < \ell_t$. Then 

$$F_{k_1} + \dots + F_{k_s} \le F_{k_s} + F_{k_s - 2} + \dots + (F_2 \text{ or } F_1) = F_{k_s + 1} - 1 < F_{k_s + 1} \le F_{\ell_t} \le F_{\ell_1} + \dots + F_{\ell_t}$$ 

(we use Lemma 30.3); a contradiction. 
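
The existence argument of Part (a) is a greedy algorithm: subtract the largest Fibonacci number that fits and repeat. The sketch below uses the indexing of this solution, $F_1 = 1$, $F_2 = 2$, $F_3 = 3$, and the function name is ours.

```python
def greedy_fibonacci_partition(n):
    """Partition n >= 1 into Fibonacci numbers 1, 2, 3, 5, 8, ... by always
    taking the largest Fibonacci number not exceeding the remainder."""
    fibs = [1, 2]
    while fibs[-1] + fibs[-2] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    parts = []
    for f in reversed(fibs):
        if f <= n:
            parts.append(f)
            n -= f
    return parts[::-1]

if __name__ == "__main__":
    for n in (10, 19, 100):
        print(n, greedy_fibonacci_partition(n))
```

The parts produced this way never include two Fibonacci numbers with consecutive indices, which is the condition of Part (b).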

(c) Let $n = F_{k_1} + \dots + F_{k_s}$ be some partition as in (a). If for some i, i > 1 and 
$k_i > k_{i-1} + 2$, or i = 1 and $k_1 > 2$, then we replace $F_{k_i}$ by $F_{k_i - 2} + F_{k_i - 1}$. We get another 
partition as in (a), with a bigger number of parts. If the condition in (c) is still not 
satisfied, we apply the same trick again, and the process must stop at some moment, since 
the number of summands cannot grow infinitely. This proves the existence of a partition 
with the required property; let us prove that it is unique. 

Let 

$$F_{k_1} + \dots + F_{k_s} = F_{\ell_1} + \dots + F_{\ell_t}$$ 

be two different partitions of the same number with $k_1 \le 2$, $\ell_1 \le 2$, $k_i - k_{i-1} \le 2$ for 
$1 < i \le s$, and $\ell_j - \ell_{j-1} \le 2$ for $1 < j \le t$. As in Solution of (b), we may assume that 
$k_s < \ell_t$. Then 

$$F_{k_1} + \dots + F_{k_s} \le F_1 + F_2 + \dots + F_{k_s} = F_{k_s + 2} - 2 < F_{k_s + 2} - 1 = F_{k_s + 1} + F_{k_s - 1} + \dots + (F_2 \text{ or } F_1)$$ 
$$\le F_{\ell_t} + F_{\ell_t - 2} + \dots + (F_2 \text{ or } F_1) \le F_{\ell_1} + \dots + F_{\ell_t}$$ 

(we used Lemmas 30.2 and 30.3); a contradiction. 

(d) It is shown in Solution of (c) that any partition $n = F_{k_1} + \dots + F_{k_s}$, $1 \le k_1 < \dots < k_s$, 
can be reduced to the partition described in (c) by a finite sequence of replacements 
$F_k \to F_{k-1} + F_{k-2}$ applied to $k = k_i$ such that $k_i > k_{i-1} + 2$ or i = 1 and $k_1 > 2$. 
Conversely, any partition as in (a) can be obtained from a partition as in (c) by finitely 
many replacements $F_k + F_{k+1} \to F_{k+2}$ applied when $k_i = k$, $k_{i+1} = k + 1$, and $k_{i+2} > k + 2$ 
(or $i + 2 > s$). For positive integers $j_1, j_2, \dots, j_q$ put 

$$S(j_1, j_2, \dots, j_q) = \underbrace{(F_1 + \dots + F_{j_1})}_{j_1} + \underbrace{(F_{j_1 + 2} + \dots + F_{j_1 + j_2 + 1})}_{j_2} + \dots + \underbrace{(F_{j_1 + \dots + j_{q-1} + q} + \dots + F_{j_1 + \dots + j_q + q - 1})}_{j_q},$$ 

$$T(j_1, j_2, \dots, j_q) = \underbrace{(F_2 + \dots + F_{j_1 + 1})}_{j_1} + \underbrace{(F_{j_1 + 3} + \dots + F_{j_1 + j_2 + 2})}_{j_2} + \dots + \underbrace{(F_{j_1 + \dots + j_{q-1} + q + 1} + \dots + F_{j_1 + \dots + j_q + q})}_{j_q}.$$ 

According to Part (c), every $n$ is either $S(j_1, \dots, j_q)$ or $T(j_1, \dots, j_q)$. We will consider the case when $n = S(j_1, \dots, j_q)$ (the proof for $n = T(j_1, \dots, j_q)$ is the same, up to the obvious change of notation and a shift of indices). For $n = S(j_1, \dots, j_q)$, we put $K_n = K(j_1, \dots, j_q)$, $H_n = H(j_1, \dots, j_q)$, and $K_n - H_n = M(j_1, \dots, j_q)$. We begin with the case $q = 1$, that is, $n = S(j)$ (which is equal to $F_{j+2} - 2$, but this is not important for us). In this case, all the partitions of $n$ into distinct Fibonacci numbers are

$$n = F_1 + \dots + F_j = F_1 + \dots + F_{j-2} + F_{j+1} = F_1 + \dots + F_{j-4} + F_{j-1} + F_{j+1} = \dots$$
$$= \begin{cases} F_3 + F_5 + F_7 + \dots + F_{j+1}, & \text{if } j \text{ is even}, \\ F_1 + F_4 + F_6 + \dots + F_{j+1}, & \text{if } j \text{ is odd}. \end{cases}$$

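We see that $K_n$ and $H_n$ are, respectively, the numbers of even and odd numbers among

$$j,\ j - 1,\ \dots,\ \left[\frac{j+1}{2}\right].$$

From this,

$$K_n = \begin{cases} m, & \text{if } j = 4m + 1, \\ m + 1, & \text{if } j = 4m,\ 4m + 2, \text{ or } 4m + 3, \end{cases} \qquad H_n = \begin{cases} m, & \text{if } j = 4m, \\ m + 1, & \text{if } j = 4m + 1,\ 4m + 2, \text{ or } 4m + 3, \end{cases}$$

and hence

$$M(j) = K_n - H_n = \begin{cases} 1, & \text{if } j \equiv 0 \bmod 4, \\ -1, & \text{if } j \equiv 1 \bmod 4, \\ 0, & \text{if } j \equiv 2 \text{ or } 3 \bmod 4. \end{cases}$$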



Let now $n = S(j_1, \dots, j_q)$, $q \ge 2$; we may assume that $|M(h_1, \dots, h_r)| \le 1$ whenever $r < q$. Any partition of $n$ into distinct Fibonacci numbers can be obtained from the partition of $S(j_1, \dots, j_q)$ shown above by a sequence of steps $F_k + F_{k+1} \to F_{k+2}$. This sequence either contains the step $F_{j_1-1} + F_{j_1} \to F_{j_1+1}$ or does not contain this step. Accordingly, we break each of $K_n$ and $H_n$ into two parts: $K'_n + K''_n$ and $H'_n + H''_n$. If this step is not applied, then the resulting partitions correspond one-to-one to those of $S(j_2, \dots, j_q)$, and the parity of the number of summands differs from that for $n$ by the parity of $j_1$. In other words,

$$K'_n - H'_n = (-1)^{j_1} M(j_2, \dots, j_q).$$

If $F_{j_1-1} + F_{j_1} \to F_{j_1+1}$ is one of our steps, then we can begin with this step. Further steps for $F_1 + \dots + F_{j_1-2}$ and $(F_{j_1+1} + \dots + F_{j_1+j_2+1}) + \dots$ are performed independently. Hence,

$$K''_n = K(j_1 - 2)K(j_2 + 1, j_3, \dots, j_q) + H(j_1 - 2)H(j_2 + 1, j_3, \dots, j_q),$$
$$H''_n = K(j_1 - 2)H(j_2 + 1, j_3, \dots, j_q) + H(j_1 - 2)K(j_2 + 1, j_3, \dots, j_q),$$

and

$$K''_n - H''_n = M(j_1 - 2)M(j_2 + 1, j_3, \dots, j_q).$$

Thus,

$$M(j_1, \dots, j_q) = (-1)^{j_1} M(j_2, \dots, j_q) + M(j_1 - 2)M(j_2 + 1, \dots, j_q).$$

If $j_1 \equiv 0$ or $1 \bmod 4$, then $M(j_1 - 2) = 0$ and

$$M(j_1, \dots, j_q) = (-1)^{j_1} M(j_2, \dots, j_q) = 0 \text{ or } \pm 1.$$

If $j_1 \equiv 2$ or $3 \bmod 4$, then $M(j_1 - 2) = (-1)^{j_1}$, and

$$M(j_1, \dots, j_q) = (-1)^{j_1}\bigl(M(j_2, \dots, j_q) + M(j_2 + 1, j_3, \dots, j_q)\bigr)$$
$$= (-1)^{j_1}\bigl((-1)^{j_2} + (-1)^{j_2+1}\bigr) M(j_3, \dots, j_q) + (-1)^{j_1}\bigl(M(j_2 - 2) + M(j_2 - 1)\bigr) M(j_3 + 1, j_4, \dots, j_q).$$

Since $(-1)^{j_2} + (-1)^{j_2+1} = 0$ and $M(j_2 - 2) + M(j_2 - 1) = 0$ or $\pm 1$ (for any $j_2$), we have

$$M(j_1, \dots, j_q) = 0 \text{ or } \pm 1,$$

which completes the proof.
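
As a sanity check of the statement just proved, the difference $K_n - H_n$ can be computed by brute force: it is the coefficient of $x^n$ in the product of the factors $(1 - x^{F_k})$, because a partition with an even number of parts contributes $+1$ and one with an odd number of parts contributes $-1$. The sketch below assumes the indexing $F_1 = 1$, $F_2 = 2$ (this agrees with $S(j) = F_{j+2} - 2$ above); the function names are illustrative only.

```python
def fibonacci_upto(limit):
    """Fibonacci numbers 1, 2, 3, 5, 8, ... (assumed F_1 = 1, F_2 = 2) up to limit."""
    fibs, a, b = [], 1, 2
    while a <= limit:
        fibs.append(a)
        a, b = b, a + b
    return fibs


def even_minus_odd(limit):
    """Return a list whose n-th entry is K_n - H_n for 0 <= n <= limit."""
    coeff = [0] * (limit + 1)
    coeff[0] = 1
    for f in fibonacci_upto(limit):
        # Multiply the generating polynomial by (1 - x^f), truncated at x^limit.
        for n in range(limit, f - 1, -1):
            coeff[n] -= coeff[n - f]
    return coeff


M = even_minus_odd(1000)
print(all(abs(c) <= 1 for c in M))  # every value is 0 or +-1 in this range
print(M[1], M[3], M[11])            # n = S(1), S(2), S(4)
```

The three printed values correspond to $j = 1, 2, 4$ and can be compared with the case formula for $M(j)$.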
LECTURE 4. 

4.1. One can take ar + 3ra: + (r — 1) = with r odd (if r is even, the formula works 
perfectly well, but the roots will be rather half-integers than integers). There are other 
solutions. 

4.4. The formula is

$$x = -2\sqrt{\frac{p}{3}}\,\sinh\left(\frac{1}{3}\sinh^{-1}\left(\frac{3\sqrt{3}\,q}{2p\sqrt{p}}\right)\right).$$

It works for $p > 0$ and gives one solution. By the way, $\sinh^{-1} x = \ln(x + \sqrt{1 + x^2})$.

4.5. (a) The auxiliary cubic equation is $y^3 - 12y + 16 = 0$; its roots are $-4, 2, 2$ (can be found either by the formula or by guessing). The plus-minus roots of the given equation are

$$\frac{\pm\sqrt{4} \pm \sqrt{-2} \pm \sqrt{-2}}{2} = \pm 1,\ \pm 1 \pm i\sqrt{2}.$$

The roots of our equation are

$$-1,\ -1,\ 1 + i\sqrt{2},\ 1 - i\sqrt{2}.$$

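(b) The auxiliary cubic equation is $y^3 - 4y^2 - 4y + 16 = 0$, the roots are $-2, 2, 4$ (found by guessing). The plus-minus roots of the given equation are

$$\pm\frac{\sqrt{2}}{2} \pm i\left(1 \pm \frac{\sqrt{2}}{2}\right).$$

It can be checked directly that

$$\frac{\sqrt{2}}{2} + i\left(1 + \frac{\sqrt{2}}{2}\right)$$

is a root. Hence, the complex conjugate

$$\frac{\sqrt{2}}{2} - i\left(1 + \frac{\sqrt{2}}{2}\right)$$

is also a root. The remaining roots are not minus the roots found, and the sum of all roots is 0. This implies that

$$-\frac{\sqrt{2}}{2} \pm i\left(1 - \frac{\sqrt{2}}{2}\right)$$

are the roots.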

(c) The auxiliary cubic equation is $y^3 - 7696y + 230400 = 0$, its roots are $36, 64, -100$. The plus-minus roots of the given equation are

$$\frac{\pm 10 \pm 6i \pm 8i}{2} = \pm 5 \pm 3i \pm 4i = \begin{cases} \pm 5 \pm i, \\ \pm 5 \pm 7i. \end{cases}$$

An easy check shows that $-5 + i$ is a root of our equation, and the same arguments as in (b) show that all four roots of the given equation are

$$-5 \pm i,\ 5 \pm 7i.$$
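
The numbers in (c) are easy to confirm mechanically. The sketch below checks that 36, 64, -100 are roots of the auxiliary cubic and expands the monic quartic with roots $-5 \pm i$, $5 \pm 7i$. The helper names are illustrative, and the quartic itself is recovered from the roots rather than quoted from the exercise statement, so treat it as an inference.

```python
def expand(roots):
    """Coefficients (highest degree first) of the monic polynomial with these roots."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def aux_cubic(y):
    """Left-hand side of the auxiliary cubic from part (c)."""
    return y**3 - 7696 * y + 230400


cubic_values = [aux_cubic(y) for y in (36, 64, -100)]
quartic = expand([-5 + 1j, -5 - 1j, 5 + 7j, 5 - 7j])
print(cubic_values)
print(quartic)
```

The expansion has no cubic or quadratic term, and its coefficients $q$ and $r$ satisfy $4r = 7696$ and $q^2 = 230400$, matching the coefficients of the auxiliary cubic.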
4.6. The auxiliary cubic equation will be

$$y^3 - py^2 - 4ry + (4pr - q^2) = 0.$$

If $y_1, y_2, y_3$ are the roots of this equation, then the eight numbers $\pm x_i$ (where $x_1, x_2, x_3, x_4$ are the roots of the given equation) are

$$\frac{\pm\sqrt{-y_1 - y_2} \pm \sqrt{-y_1 - y_3} \pm \sqrt{-y_2 - y_3}}{2}.$$

LECTURE 5 

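5.1. The radical solution of the equation $x^4 + qx + r = 0$ is presented in 7 equalities below (as usual, $\varepsilon$ is $-\frac{1}{2} + \frac{\sqrt{3}}{2}i$).

$$x_1^2 = -\frac{64r^3}{27} + \frac{q^4}{4}, \qquad x_4^2 = -x_2 - x_3,$$
$$x_2^3 = -\frac{q^2}{2} + x_1, \qquad x_5^2 = -\varepsilon x_2 - \bar\varepsilon x_3,$$
$$x_3^3 = -\frac{q^2}{2} - x_1, \qquad x_6^2 = -\bar\varepsilon x_2 - \varepsilon x_3,$$
$$x_7 = \frac{x_4 + x_5 + x_6}{2}.$$

The number of solutions is 24. (Why not $2 \cdot 3 \cdot 3 \cdot 2 \cdot 2 \cdot 2 = 144$? An alternative choice of $x_1$ will lead only to switching $x_2$ and $x_3$; similarly, replacing $x_2, x_3$ by $\varepsilon x_2, \bar\varepsilon x_3$ or by $\bar\varepsilon x_2, \varepsilon x_3$ will result in a permutation of $x_4, x_5, x_6$.) The 24 solutions are solutions of 6 equations, $x^4 \pm qx + r = 0$, $x^4 \pm qx + \varepsilon r = 0$, $x^4 \pm qx + \bar\varepsilon r = 0$.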

5.2. Let S be the set of partitions of a 4-element set into 2 pairs: {12/34}, {13/24}, {14/23}. 
It is easy to check that every even permutation of the set {1, 2, 3, 4} gives rise to an even 
(cyclic) permutation in S. Since cyclic permutations commute, a commutator of even 
permutations must be an identity on S. There are four such permutations: the identity 
and (2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1). All of them are commutators of even permutations: 

(2,1,4,3) = [(3,1,2,4), (4,1,3,2)],
(3,4,1,2) = [(3, 1,2, 4), (1,3, 4, 2)], 
(4,3,2,1) = [(3, 1,2, 4), (1,4, 2, 3)]. 
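
The three commutator identities can be checked by direct computation. The sketch below assumes that a permutation is written in one-line notation (the $i$-th entry is the image of $i$) and that $[a, b] = a^{-1}b^{-1}ab$ with the rightmost factor applied first; these conventions are an assumption, chosen because all three equalities hold under them.

```python
from itertools import permutations


def compose(f, g):
    """Return the permutation x -> f(g(x)), in one-line notation."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


def inverse(f):
    inv = [0] * len(f)
    for i, image in enumerate(f, start=1):
        inv[image - 1] = i
    return tuple(inv)


def commutator(a, b):
    """[a, b] = a^{-1} b^{-1} a b, rightmost factor applied first (assumed convention)."""
    return compose(inverse(a), compose(inverse(b), compose(a, b)))


def is_even(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0


even = [p for p in permutations((1, 2, 3, 4)) if is_even(p)]
all_commutators = {commutator(a, b) for a in even for b in even}
print(sorted(all_commutators))
```

The set printed at the end lists every commutator of two even permutations, so it can be compared with the four permutations named in the solution.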



LECTURE 6 
6.5. Let

$$f_n(x) = 1 + x + \frac{x^2}{2!} + \dots + \frac{x^n}{n!}.$$

Then $f_n' = f_{n-1} = f_n - x^n/n!$. Therefore

$$(f_n e^{-x})' = (f_n' - f_n)e^{-x} = -\frac{x^n}{n!}e^{-x}.$$

If $n$ is even then the last function is everywhere negative (except $x = 0$ where it equals zero). It follows that $f_n e^{-x}$ is a decreasing function. Since $\lim_{x \to +\infty}(f_n e^{-x}) = 0$, one has: $f_n e^{-x} > 0$ for all $x$, and hence $f_n$ has no roots. If $n$ is odd then $f_n(x)$ is negative for large negative $x$ and positive for large positive $x$, hence $f_n$ has a root. If it has more than one root then, by Rolle's theorem, $f_n'$ has a root. But $f_n' = f_{n-1}$, and we already proved that $f_{n-1}$ has no roots.

Alternatively, it suffices to prove that $f_n$ cannot have two consecutive negative roots. Assume that $a$ and $b$ are negative roots. Then

$$f_n(a) = f_n'(a) + \frac{a^n}{n!} = 0, \qquad f_n(b) = f_n'(b) + \frac{b^n}{n!} = 0,$$

and hence $f_n'(a)$ and $f_n'(b)$ are either both positive (if $n$ is odd) or both negative (if $n$ is even). But the signs of the derivative at consecutive roots are opposite.
6.7. Denote the number of sign changes in the sequence $a_1, \dots, a_n$ by $S$ and the number of roots of $f(x)$ by $Z$. Argue by induction on $S$. If $S = 0$ then obviously $Z = 0$. Let $k$ be such that $a_k a_{k+1} < 0$ and choose $\lambda \in (\lambda_k, \lambda_{k+1})$. Set

$$g(x) = e^{\lambda x}\bigl(e^{-\lambda x} f(x)\bigr)' = \sum a_i(\lambda_i - \lambda)e^{\lambda_i x}.$$

Let $s$ and $z$ have the same meanings for $g$ as $S$ and $Z$ for $f$. By Rolle's theorem, $z \ge Z - 1$. The coefficients of $g$ are

$$-a_1(\lambda - \lambda_1),\ \dots,\ -a_k(\lambda - \lambda_k),\ a_{k+1}(\lambda_{k+1} - \lambda),\ \dots,\ a_n(\lambda_n - \lambda),$$

and hence $s = S - 1$. By the induction assumption, $s \ge z$, therefore $S \ge Z$.

6.9. We assume that $f(x)$ is in general position: no two derivatives $f^{(i)}(x)$ and $f^{(j)}(x)$ have common roots; in particular, no derivative has a multiple root (the argument in the general case is similar, see [64]). Let $x$ vary from $a$ to $b$. If $x$ passes a root of $f$, say, $c$, then, near the point $c$, the signs of the sequence $f(x), f'(x), f''(x), \dots, f^{(n)}(x)$ are those of $(x - c)g(c), g(c), g'(c), g''(c), \dots$ where $f(x) = (x - c)g(x)$. As $x$ passes $c$ the number of sign changes in the latter sequence decreases by 1.

Assume now that $x$ passes a root of $f^{(i)}$ with $i \ge 1$, again denoted by $c$. We need to show that the number of sign changes in the sequence $f^{(i-1)}(x), f^{(i)}(x), f^{(i+1)}(x)$ decreases by an even non-negative number. As before, there is a sign change in $(f^{(i)}(x), f^{(i+1)}(x))$ left of $c$, and none right of $c$. As to $(f^{(i-1)}(x), f^{(i)}(x))$, the sign changes here are the same as in $(f^{(i-1)}(c), (x - c)g(c))$ where $f^{(i)}(x) = (x - c)g(x)$. If the signs of $f^{(i-1)}(c)$ and $g(c)$ coincide then, as $x$ passes $c$, the number of sign changes decreases by 1, and if these signs are opposite then the number of sign changes increases by 1. This implies the claim.

LECTURE 7 

7.3. Arguing as in the proof of Theorem 7.1, it suffices to find a function $ax + b$ such that $f(x) = e^x - ax - b$ has equal maximum values at the end points 0 and 1, and the minimal value opposite to this maximum. Thus $1 - b = f(0) = f(1) = e - a - b$, and therefore $a = e - 1$. To find the minimum, set $f'(x) = 0$, hence $e^x = a$. Thus the minimal value is $a - a\ln a - b$. Therefore $a - a\ln a - b = b - 1$, and the least deviation of $f(x)$ from zero equals

$$\frac{2 - e + (e - 1)\ln(e - 1)}{2}.$$
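
The value of the least deviation is easy to confirm numerically. The sketch below computes $a$ and $b$ from the two conditions used in the solution and compares the closed-form deviation with the maximum of $|e^x - ax - b|$ on a fine grid of $[0, 1]$; the grid size is an arbitrary choice.

```python
import math

a = math.e - 1                       # slope, from f(0) = f(1)
b = (1 + a - a * math.log(a)) / 2    # from a - a ln a - b = b - 1
deviation = 1 - b                    # common value of f(0), f(1) and -min f


def f(x):
    return math.exp(x) - a * x - b


closed_form = (2 - math.e + (math.e - 1) * math.log(math.e - 1)) / 2
grid_max = max(abs(f(i / 10000)) for i in range(10001))
print(deviation, closed_form, grid_max)
```

All three printed numbers agree to within the grid resolution, which illustrates the equal-oscillation property at $x = 0$, $x = \ln a$ and $x = 1$.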

7.8. (A sketch.) A linear substitution $\phi(x) = ax + b$ changes a polynomial $f(x)$ to $\bar f(x) = \phi^{-1} \circ f \circ \phi = (f(ax + b) - b)/a$. If $f \circ g = g \circ f$ then $\bar f \circ \bar g = \bar g \circ \bar f$. Let $f_n$ be a sequence of commuting polynomials, $\deg f_n = n$. Applying a linear substitution, one can change $f_2(x)$ to $x^2 + \gamma$. Let $f_3(x) = ax^3 + bx^2 + cx + d$. The equality $f_2 \circ f_3 = f_3 \circ f_2$ is equivalent to the system of equations

$$\begin{cases} a^2 = a, \quad ab = 0, \quad b^2 + 2ac = 3a\gamma + b, \quad ad + bc = 0, \\ c^2 + 2bd = 3a\gamma^2 + 2b\gamma + c, \quad cd = 0, \quad d^2 + \gamma = a\gamma^3 + b\gamma^2 + c\gamma + d. \end{cases}$$

This system has only two solutions: $\gamma = 0$ or $\gamma = -2$.

In the first case, $f_2(x) = x^2$, hence $f_n(x^2) = f_n^2(x)$ and more generally, $f_n(x^{2^k}) = f_n^{2^k}(x)$ for all $n, k$. Therefore every root of $f_n(x^{2^k})$ is a root of $f_n(x)$. If $f_n$ has a root other than 0 then the set of distinct (complex) roots of the polynomials $f_n(x^{2^k})$, $k = 1, 2, \dots$ is infinite. But $f_n(x)$ has finitely many roots, and thus $f_n(x) = ax^n$. Since $f_n(x^2) = f_n^2(x)$ one has $a = 1$.

If $f_2(x) = x^2 - 2$ then $f_n(x^2 - 2) = f_n^2(x) - 2$ for all $n$. Differentiate to obtain $x f_n'(x^2 - 2) = f_n(x) f_n'(x)$. Set $g_n(x) = (4 - x^2) f_n'(x)^2 + n^2(f_n^2(x) - 4)$. On the one hand, the leading coefficient of $g_n(x)$ is zero so $g_n$ has degree less than $2n$. On the other hand, one can check that $g_n(x^2 - 2) = g_n(x) f_n^2(x)$, and hence $\deg g_n$ is $2n$ unless $g_n$ is identically zero. It follows that $g_n(x) = 0$. Therefore $f_n$ satisfies the differential equation $(4 - x^2) f_n'(x)^2 + n^2(f_n^2(x) - 4) = 0$ which can be explicitly solved: $f_n(x) = 2\cos(n \arccos(x/2) + c)$. To find the constant, use the equality $f_n(x^2 - 2) = f_n^2(x) - 2$ which implies that $f_n(2) = 2$, and hence $c = 0$.
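
The second family can be generated and tested with exact integer arithmetic. The sketch below builds the polynomials satisfying $f_n(2\cos t) = 2\cos nt$ from the recurrence $f_{n+1} = x f_n - f_{n-1}$, $f_0 = 2$, $f_1 = x$ (a standard recurrence for these polynomials, not taken from the solution above), and checks that they commute under composition. Coefficient lists are stored lowest degree first; all function names are illustrative.

```python
def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_compose(p, q):
    """Coefficients of p(q(x)), computed by Horner's rule, trailing zeros removed."""
    out = [0]
    for a in reversed(p):
        out = poly_add(poly_mul(out, q), [a])
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out


def family(n_max):
    """f_0 = 2, f_1 = x, f_{n+1} = x f_n - f_{n-1}."""
    fs = [[2], [0, 1]]
    for _ in range(2, n_max + 1):
        fs.append(poly_add([0] + fs[-1], [-c for c in fs[-2]]))
    return fs


fs = family(6)
print(fs[2], fs[3])  # x^2 - 2 and x^3 - 3x
print(poly_compose(fs[2], fs[3]) == poly_compose(fs[3], fs[2]) == fs[6])
```

Here $f_3 = x^3 - 3x$ corresponds to $a = 1$, $b = 0$, $c = -3$, $d = 0$, $\gamma = -2$ in the system above.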

LECTURE 10 

10.1. (a) See Figure 30.8. 




Figure 30.8. Solution to Exercise 10.6 (a) 
10.1. (b) See Figure 30.9. 




Figure 30.9. Solution to Exercise 10.6 (b) 
10.4. (a) Answer.

$$L = \int_0^{2\pi} p(\varphi)\,d\varphi, \qquad A = \frac{1}{2}\int_0^{2\pi} p(\varphi)\bigl(p(\varphi) + p''(\varphi)\bigr)\,d\varphi.$$
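
These two formulas can be tried out numerically. The sketch below evaluates them with a periodic Riemann sum and a central second difference for $p''$, assuming the integrals run over $[0, 2\pi]$; it is applied to a circle and to an ellipse, whose support function $\sqrt{a^2\cos^2\varphi + b^2\sin^2\varphi}$ is a standard fact not stated in the text.

```python
import math


def length_and_area(p, samples=4000):
    """Approximate L and A of a convex curve from its support function p(phi)."""
    step = 2 * math.pi / samples
    values = [p(i * step) for i in range(samples)]
    length = area = 0.0
    for i, p_i in enumerate(values):
        # Periodic central second difference approximates p''(phi_i).
        p_second = (values[(i + 1) % samples] - 2 * p_i + values[i - 1]) / step**2
        length += p_i * step
        area += 0.5 * p_i * (p_i + p_second) * step
    return length, area


def ellipse_support(a, b):
    """Support function of the ellipse with semi-axes a and b, centered at the origin."""
    return lambda phi: math.sqrt((a * math.cos(phi))**2 + (b * math.sin(phi))**2)


print(length_and_area(lambda phi: 3.0))      # circle of radius 3
print(length_and_area(ellipse_support(2, 1)))
```

For the circle the results should be close to $6\pi$ and $9\pi$, and for the ellipse the area should be close to $2\pi$.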
10.5. Assume that $f^{(n+1)}(t) > 0$ on $I$. Let $a < b$ and suppose that $g_a(x) = g_b(x)$ for some $x$. One has

$$\frac{dg_t}{dt}(x) = \sum_{i=0}^{n} \frac{f^{(i+1)}(t)}{i!}(x - t)^i - \sum_{i=0}^{n-1} \frac{f^{(i+1)}(t)}{i!}(x - t)^i = \frac{f^{(n+1)}(t)}{n!}(x - t)^n,$$

and hence $(dg_t/dt)(x) > 0$ (except for $t = x$). It follows that $g_t(x)$ increases, as a function of $t$, therefore $g_a(x) < g_b(x)$. This is a contradiction.

10.6. Consider the sequence of functions $f_q = (-1)^q (kI)^{2q}(f)$, namely,

$$f_q(x) = a_k \cos kx + b_k \sin kx + \left(\frac{k}{k+1}\right)^{2q}\bigl(a_{k+1}\cos(k+1)x + b_{k+1}\sin(k+1)x\bigr) + \dots + \left(\frac{k}{n}\right)^{2q}(a_n \cos nx + b_n \sin nx).$$

By the Rolle theorem, for every $q$, one has: $Z(f) \ge Z(f_q)$. For large $q$, the function $f_q(x)$ is arbitrarily close to $a_k \cos kx + b_k \sin kx$, therefore $f_q$ has $2k$ sign changes. Thus $Z(f) \ge 2k$.

LECTURE 11 

11.6. In both cases, this equals the number of tangent lines from a given point to the 
envelope of the family of lines that bisect the area of the polygon. This envelope is a 
concave triangle made of arcs of hyperbolas, and the answer is either one or three, if the 
point is not on the envelope, and two if it lies on the envelope. 

11.7. Order the vertices cyclically $V_1, V_2, \dots, V_n$ and assume that the correspondence side $\mapsto$ opposite vertex is one-to-one. Let side $V_{i-1}V_i$ be opposite to vertex $V_j$. Then side $V_iV_{i+1}$ is opposite to the next vertex $V_{j+1}$, otherwise the correspondence between the sides and the opposite vertices cannot be one-to-one. It follows that there exists $k$ such that, for every $i$, side $V_iV_{i+1}$ is opposite to the vertex $V_{i+k}$ (of course, we understand the indices cyclically mod $n$). It is clear that side $V_{i+k-1}V_{i+k}$ is opposite to the vertex $V_i$, and hence $(i + k - 1) + k = i \bmod n$. It follows that $n$ is odd.

LECTURE 12 

12.3. Orient one curve and consider the number of intersections of its positive tangent half-line with the other curve. As the tangency point traverses the first curve, this number changes as described in Section 12.2. Computed for both orientations of the first curve, the total contribution of each outer double tangent is 2, of each inner double tangent is $-2$, and of each intersection point is $-2$. Since the total change along the closed curve is zero, $t_+ = t_- + d$.

12.5. The solution is based on [40]. Let $I$ be a positive even number and $T_+ - T_- - I/2 = D$. We need three preparatory constructions. First, let $I = 2$, $D = 2$ and $T_-$ arbitrary. A curve is shown in Figure 30.10, left or right, depending on the parity of $T_-$.

Next, let $T_- = 0$, $D = 0$ and $I$ an arbitrary even number. A curve is shown in Figure 30.11.

Third, one can adjust two nearly parallel arcs to obtain additional self-intersections without affecting $I$ or $T_-$, as shown in Figure 30.12.

Now, starting with the first construction with the desired $T_-$, adjust it by the second and third constructions as shown in Figure 30.13. This solves (a).

To prove (b), let $\gamma(\alpha)$ be a parameterization of the given curve by the angle made by its tangent direction with a fixed direction in the plane. Denote by $w$ the winding number; the parameter $\alpha$ varies between 0 and $2\pi w$. One has a double tangent line at points $\gamma(\alpha)$ and $\gamma(\beta)$ if and only if the vectors $\gamma'(\alpha)$ and $\gamma(\beta) - \gamma(\alpha)$ are parallel. This implies that $\beta = \alpha + \pi k$, and the double tangent is inner if and only if $k$ is odd. For a fixed $k$, let

(30.4) $A_k^\pm = \{\alpha \mid \gamma'(\alpha) = t(\gamma(\alpha + \pi k) - \gamma(\alpha))\}$

with $t$ positive or negative, respectively. We make three claims whose proofs are left to the reader:
Figure 30.10. The case of $I = 2$, $D = 2$

Figure 30.11. The case of $T_- = 0$, $D = 0$

Figure 30.12. Making additional self-intersections

Figure 30.13. Combining the preceding constructions in one curve

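(1) the points of $A_k^+$ and $A_k^-$ alternate;

(2) if distinct $\alpha, \beta \in A_k^\pm$ then $|\beta - \alpha| > \pi$;

(3) $A_1^\pm$ and $A_{2w-1}^\pm$ are empty.

One has:

$$T_- = \frac{1}{2}\sum_{1 \le k \le 2w-1;\ k \text{ odd}} \bigl(|A_k^+| + |A_k^-|\bigr).$$

By claim (1), $|A_k^+| = |A_k^-|$; by claim (2), $|A_k^\pm| \le 2w - 1$ for each $k$; and then, by claim (3), $T_- \le (w - 2)(2w - 1)$. By Exercise 12.11, $w \le D + 1$, and it follows that $T_- \le (2D + 1)(D - 1)$. Finally, the number $T_-$ splits into two summands, according as $t > 0$ or $t < 0$ in (30.4). These summands equal

$$\frac{1}{2}\sum_{1 \le k \le 2w-1;\ k \text{ odd}} |A_k^+| \quad \text{and} \quad \frac{1}{2}\sum_{1 \le k \le 2w-1;\ k \text{ odd}} |A_k^-|,$$

respectively. By claim (1), the two sums are equal and hence $T_-$ is even.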

Considering (c), let $T_-$ be even and $T_- \le D(D - 1)$. Then there exists $n \le D$ such that $\binom{n-1}{2} < T_-/2 \le \binom{n}{2}$. Set $k = \binom{n}{2} - T_-/2$. Then $k < n - 1$. Set $q = D - n$. A desired curve is constructed in three steps. First, consider a closed curve without inflections which makes $q + 1$ loops, then insert $n - k$ small loops into the inner large loop of the curve, and then insert $k$ very small loops into one of the small loops. See Figure 30.14 where $q = 2$, $n = 8$, $k = 3$. This curve has $q + n = D$ double points. Every pair of small or very small loops contributes two inner tangents, except for the pairs (small loop, one of the very small loops inside it). Thus the number of inner double tangents is $2\left(\binom{n}{2} - k\right) = T_-$.
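
The choice of the parameters $n$, $k$, $q$ in (c) is a small computation that can be sketched as follows. The function name is illustrative, and the sketch assumes $T_- > 0$, since the inequality $\binom{n-1}{2} < T_-/2$ has no solution when $T_- = 0$.

```python
from math import comb


def loop_parameters(D, T_minus):
    """Return (n, k, q) with C(n-1, 2) < T_minus/2 <= C(n, 2), k = C(n, 2) - T_minus/2, q = D - n."""
    if T_minus <= 0 or T_minus % 2 != 0 or T_minus > D * (D - 1):
        raise ValueError("expected a positive even T_minus with T_minus <= D(D - 1)")
    half = T_minus // 2
    n = 2
    while comb(n, 2) < half:
        n += 1
    k = comb(n, 2) - half
    q = D - n
    return n, k, q


n, k, q = loop_parameters(10, 50)
print(n, k, q)                   # the case drawn in Figure 30.14
print(2 * (comb(n, 2) - k))      # number of inner double tangents
```

With $D = 10$ and $T_- = 50$ this reproduces the values $q = 2$, $n = 8$, $k = 3$ quoted for Figure 30.14.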

12.7. See [89]. 

12.8. See [29]. 

LECTURE 13 

13.1. The equation of the plane $\Pi$ has the form

$$A(x - x(t_0)) + B(y - y(t_0)) + C(z - z(t_0)) = 0.$$

The plane $\Pi$ contains the tangent line

$$x = x(t_0) + ux'(t_0), \quad y = y(t_0) + uy'(t_0), \quad z = z(t_0) + uz'(t_0)$$

($u$ is a parameter on the line) if and only if

$$Ax'(t_0) + By'(t_0) + Cz'(t_0) = 0.$$

Consider the function

$$h(t) = A(x(t) - x(t_0)) + B(y(t) - y(t_0)) + C(z(t) - z(t_0))$$

(on the curve). Obviously, $h(t_0) = h'(t_0) = 0$, and, since the curve passes at $P$ from one side of the plane to the other one, $h$ changes sign at $t = t_0$. Hence, the second derivative of $h$ also equals 0 at $t_0$. Thus,

$$Ax''(t_0) + By''(t_0) + Cz''(t_0) = 0,$$

that is, the plane contains not only the velocity vector, but also the acceleration vector. It is the osculating plane.

Figure 30.14. Solution to Exercise 12.5 (c)

13.2. If $x = x(t)$, $y = y(t)$, $z = z(t)$ are parametric equations of the cuspidal edge, then the parametric equations of the developable surface (made of the tangent lines to the curve) are

$$x = x(t) + ux'(t), \quad y = y(t) + uy'(t), \quad z = z(t) + uz'(t).$$

The tangent plane to the surface at the point $(t, u)$ is spanned by the vectors

$$(x'_t, y'_t, z'_t) = (x'(t) + ux''(t),\ y'(t) + uy''(t),\ z'(t) + uz''(t)),$$
$$(x'_u, y'_u, z'_u) = (x'(t),\ y'(t),\ z'(t)).$$

It is the same as the osculating plane to the cuspidal edge.

13.3. (A sketch.) Let $\Pi(t)$ be our family. The intersection line of the planes $\Pi(t)$, $\Pi(t + \varepsilon)$ ($\varepsilon \ne 0$) has a limit $\ell(t) \subset \Pi(t)$ as $\varepsilon \to 0$. The point of intersection of the three planes $\Pi(t)$, $\Pi(t + \varepsilon_1)$, $\Pi(t + \varepsilon_2)$ ($\varepsilon_1 \ne \varepsilon_2$, $\varepsilon_1 \ne 0$, $\varepsilon_2 \ne 0$) has a limit $\gamma(t) \in \ell(t)$ as $\varepsilon_1 \to 0$, $\varepsilon_2 \to 0$. The union of lines $\ell(t)$ is a developable surface whose tangent planes are $\Pi(t)$; the union of points $\gamma(t)$ is a curve whose osculating planes are $\Pi(t)$.

13.5. The parametric equations of the surface made of the tangent lines to the given curve are

$$x = t + u, \quad y = t^3 + 3t^2u, \quad z = t^4 + 4t^3u.$$

The intersection of this surface with the plane $x = c$ (in the coordinate system $y, z$) is

$$y = t^3 + 3t^2(c - t) = t^2(3c - 2t),$$
$$z = t^4 + 4t^3(c - t) = t^3(4c - 3t).$$

The derivatives $y' = 6t(c - t)$, $z' = 12t^2(c - t)$ have two common zeroes: $t = 0$ and $t = c$; thus the curve has two cusps, $(0, 0)$ and $(c^3, c^4)$. Accordingly, the surface has two cuspidal edges: the given curve $x = t$, $y = t^3$, $z = t^4$ and, more surprisingly, the $x$ axis $x = t$, $y = z = 0$.

Besides two cusps, our curve has also a self-intersection. Indeed, the system

$$t_1^2(3c - 2t_1) = t_2^2(3c - 2t_2),$$
$$t_1^3(4c - 3t_1) = t_2^3(4c - 3t_2),$$

besides the obvious solution $t_1 = t_2$, has solutions

$$t_1 = \frac{1 + \sqrt{3}}{2}c,\ t_2 = \frac{1 - \sqrt{3}}{2}c; \qquad t_1 = \frac{1 - \sqrt{3}}{2}c,\ t_2 = \frac{1 + \sqrt{3}}{2}c,$$

which means that at the parameter values $\frac{1 \pm \sqrt{3}}{2}c$ the values of the coordinates $y, z$ are the same. An easy computation shows that these values are $y = \frac{c^3}{2}$, $z = -\frac{c^4}{4}$. Thus, besides two cuspidal edges, the surface has a self-intersection along the curve

$$x = t, \quad y = \frac{t^3}{2}, \quad z = -\frac{t^4}{4}.$$

We leave to the reader the pleasant work of visualizing these results.
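
The self-intersection can be confirmed numerically before trying to visualize it. The sketch below evaluates the planar section $y = t^2(3c - 2t)$, $z = t^3(4c - 3t)$ at the two parameter values $t = \frac{1 \pm \sqrt{3}}{2}c$ and compares both points with $(c^3/2, -c^4/4)$; the function names and sample values of $c$ are arbitrary.

```python
import math


def section_point(t, c):
    """Point (y, z) of the section x = c of the tangent surface, at parameter t."""
    return t**2 * (3 * c - 2 * t), t**3 * (4 * c - 3 * t)


def self_intersection(c):
    """The two points of the section at t = (1 + sqrt 3)c/2 and t = (1 - sqrt 3)c/2."""
    t1 = (1 + math.sqrt(3)) / 2 * c
    t2 = (1 - math.sqrt(3)) / 2 * c
    return section_point(t1, c), section_point(t2, c)


for c in (1.0, -2.0, 0.5):
    first, second = self_intersection(c)
    print(c, first, second, (c**3 / 2, -c**4 / 4))
```

For each $c$ the three printed points agree up to rounding error.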

13.6. The solution is described in [82]. For (a), unfold the disc to the plane to obtain a plane disc foliated by straight segments, the rulings of the developable surface, and a curve $\gamma$ on it. We need to prove that there are two intersection points of $\gamma$ with a ruling at which the tangent lines to $\gamma$ are parallel. Since the tangent planes along a ruling of a developable surface are the same, the respective tangent lines to $\gamma$ in space will be parallel as well.




Figure 30.15. Curve and rulings 



Let $l_t$, $t \in [0, 1]$, be the family of lines foliating the domain; think of these lines as oriented from left to right. Assume that $\gamma$ lies between $l_0$ and $l_1$, and that these lines touch the curve, see Figure 30.15. Let $C_+(t)$ and $C_-(t)$ be the rightmost and the leftmost points of $\gamma \cap l_t$. The curves $C_\pm(t)$ are piecewise smooth. Let $\alpha_\pm(t) \in [0, \pi]$ be the angle between $C_\pm(t)$ and $l_t$. The functions $\alpha_\pm(t)$ are not continuous but their discontinuities are easily described. The three types of discontinuities of $\alpha_-(t)$ are shown in Figure 30.16: in each case the graph of $\alpha_-(t)$ has a descending vertical segment. Likewise, there are three types of discontinuities of $\alpha_+(t)$, and in each case the graph has an ascending vertical segment. Note that, for $t$ close to 0, $\alpha_-(t)$ is close to 0, while $\alpha_+(t)$ is close to $\pi$. Similarly, for $t$ close to 1, $\alpha_-(t)$ is close to $\pi$ and $\alpha_+(t)$ is close to 0.

Figure 30.16. Discontinuities of the functions $\alpha_\pm(t)$

We claim that the graphs of $\alpha_-(t)$ and $\alpha_+(t)$ have a common point not on a vertical segment of either graph. Approximate both functions by smooth ones so that the vertical segments of the graphs become very steep: $\alpha'_-(t) \ll 0$ and $\alpha'_+(t) \gg 0$ on the corresponding small intervals. Consider the function $\beta(t) = \alpha_+(t) - \alpha_-(t)$. For $t$ near 0, $\beta(t) > 0$, and for $t$ near 1, $\beta(t) < 0$. Let $t_0$ be the smallest zero of $\beta(t)$. Since $\beta$ changes sign from positive to negative, $\beta'(t_0) \le 0$. Therefore $t_0$ is not on the segments where $\alpha'_-(t) \ll 0$ or $\alpha'_+(t) \gg 0$. Thus $t_0$ is a desired common value of the two angles.

For (b), take an equilateral triangle in the plane and draw three lines near each corner, cutting across and not parallel to opposite sides. Fold smoothly along these lines (that is, approximate a fold line by a cylinder of a small radius) so that the obtained surface consists of a flat hexagon with three triangular flaps going more-or-less vertically. The curve $\gamma$ is a simple smooth convex curve approximating the perimeter of the original triangle. After the flaps are folded, the curve doesn't have parallel tangents in space, see Figure 30.17.
15.3. Denote the unit tangent, the unit normal and the binormal vectors of the arc length parameterized curve $\gamma(t)$ by $T(t)$, $N(t)$ and $B(t)$. One has the Frenet formulas:

$$T'(t) = \kappa(t)N(t), \quad N'(t) = -\kappa(t)T(t) - \tau(t)B(t), \quad B'(t) = \tau(t)N(t).$$

Let $v(t)$ be a vector along the ruling at the point $\gamma(t)$. The surface generated by the rulings has a parameterization $r(t, s) = \gamma(t) + sv(t)$. Then $r_t = \gamma' + sv'$, $r_s = v$ are tangent vectors to the surface. Since the surface is developable, the normal $\nu(t)$ along a ruling remains the same. Then $\nu(t)$ is orthogonal to $v$ and to $\gamma' + sv'$ for all $s$, and hence to $\gamma'$ and to $v'$. Therefore the vectors $v, v', \gamma'$ are coplanar. It is easy to express $v$ in terms of $T(t)$, $N(t)$ and $B(t)$:

$$v = T\cot\beta + N\cos\alpha + B\sin\alpha.$$

Differentiate using the Frenet formulas:

$$v' = -T(\kappa\cos\alpha + \beta'\operatorname{cosec}^2\beta) + N(\kappa\cot\beta + (\tau - \alpha')\sin\alpha) + B(\alpha' - \tau)\cos\alpha.$$

It is now straightforward to compute the determinant of the vectors $v', v, T$ which equals $\kappa\sin\alpha\cot\beta + \tau - \alpha'$. Equating to zero yields formula (15.2).

15.6. If there is a hole inside the fold then the straight rulings that go inside it have their other end points on the boundary of the hole. If there is no hole then the other end points of the rulings must lie on the fold as well. Since the family of rulings is continuous, there will be rulings that are tangent to the fold, and this contradicts the fact that the rulings make non-zero angles with the fold.

LECTURE 16 

16.1. Answer: a hyperbolic paraboloid.

16.2. Let $ABCD$ be the given quadrilateral. Take the union, for all planes $\Pi$ parallel to the lines $AB$ and $CD$, of lines $KM$ where $K$ and $M$ are the points of intersection of $\Pi$ with the lines $BC$ and $DA$.

16.6. (a) The ruling chosen is projected onto a point, denote it by P. The family 
containing this ruling is projected onto a family of parallel lines not passing through P. 
The other family is projected onto the family of lines passing through P except the line 
parallel to the images of the rulings of the first family. 

(b) The projections will form two families of parallel lines. 

LECTURE 17 

17.1. The 12 lines passing through the points $(0,0,0)$, $(1,0,0)$, $(0,1,0)$, $(0,0,1)$ are

x = 0, y = -z;   x = 1, y = 0;    y = 1, x = 0;    z = 1, x = 0;
y = 0, x = -z;   x = 1, z = 0;    y = 1, z = 0;    z = 1, y = 0;
z = 0, x = -y;   x = 1, y = -z;   y = 1, x = -z;   z = 1, x = -y.

To find the remaining lines we use the fact that each of the 27 lines intersects 10 other lines. This shows that every known line intersects some unknown lines. Say, take the line $x = 0$, $y = -z$. Its general point is $(0, t, -t)$. Thus, for some values of $t$, and some $p, q, r$, the point $x = pu$, $y = t + qu$, $z = -t + ru$ must belong to the surface for all $u$. Plug the point into the equation of the surface. We will get a polynomial in $u$ of degree 3 without the constant term (for $u = 0$, the point lies on the surface for every $t$). Equate to zero the coefficients at $u, u^2, u^3$, and determine for which $t$ the system has a non-zero solution in $p, q, r$.

In this way, we find one more line on the surface; it is given by parametric equations

$$x = \frac{\sqrt{5} + 1}{2} + \frac{\sqrt{5} - 1}{2}t, \quad y = \frac{1 - \sqrt{5}}{2}t, \quad z = 1 + t.$$

The remaining 11 lines can be obtained from the last one by permutations of $x$, $y$ and $z$ and replacing $\sqrt{5}$ by $-\sqrt{5}$.
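
A numerical check of these lines is sketched below. The equation of the surface is not restated in this solution; the sketch assumes it is $x^3 + y^3 + z^3 - 1 = (x + y + z - 1)^3$, an assumption that is consistent with all twelve lines listed above. The function names are illustrative.

```python
import math


def surface(x, y, z):
    """Assumed defining function: vanishes exactly on the cubic surface."""
    return x**3 + y**3 + z**3 - 1 - (x + y + z - 1)**3


def golden_line(t, sign=1):
    """The extra line found above; sign=-1 replaces sqrt(5) by -sqrt(5)."""
    s = sign * math.sqrt(5)
    return (s + 1) / 2 + (s - 1) / 2 * t, (1 - s) / 2 * t, 1 + t


# The twelve lines through (0,0,0), (1,0,0), (0,1,0), (0,0,1), row by row.
twelve_lines = [
    lambda t: (0, t, -t), lambda t: (1, 0, t), lambda t: (0, 1, t), lambda t: (0, t, 1),
    lambda t: (t, 0, -t), lambda t: (1, t, 0), lambda t: (t, 1, 0), lambda t: (t, 0, 1),
    lambda t: (t, -t, 0), lambda t: (1, t, -t), lambda t: (t, 1, -t), lambda t: (t, -t, 1),
]

samples = [k / 4 for k in range(-12, 13)]
worst = max(abs(surface(*line(t))) for line in twelve_lines for t in samples)
worst_golden = max(abs(surface(*golden_line(t, sign))) for sign in (1, -1) for t in samples)
print(worst, worst_golden)
```

Both printed residuals are at the level of rounding error, so under the stated assumption all thirteen lines, and the conjugate of the last one, lie on the surface.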
17.2. It is easy to find 12 lines on our surface. One of them is given by the equations

2^2 

y = ax, z = b where b = — , [3a + ba + (3 — 0; 11 more can be obtained by varying the 

values of a and b and by permutations of x, y, and z. One more line can be found by 

2/3 

the method described in the previous solution; it is z — x H h b, y = —2/3; 11 more 

a 

by varying the values of a and b and by permutations of the coordinates. Three lines lie at infinity: this follows from the fact that our surface, for big $x, y, z$, is asymptotically close to $xyz = 0$, which is the union of three planes, so the intersection with the infinite plane consists of three lines.



LECTURE 18 

18.6. Let the zero element $E$ be one of the inflection points of the cubic curve. Given a point $A$ of the cubic curve, let $B$ be the third intersection point of the line $AE$ with the curve. We claim that $A + B = E$. Indeed, since $E$ is an inflection, the third intersection point of the tangent line at $E$ with the cubic curve is $E$. (This implies that points $A$, $B$ and $C$ are collinear if and only if $A + B + C = E$.)

Let $A$ be an inflection point. Then the construction for the addition of points implies that $A + A = -A$, that is, $3A = E$. Conversely, if $2A = -A$ then the third intersection point of the tangent line at point $A$ with the cubic curve is again $A$, hence this tangent line has third order tangency with the curve and $A$ is an inflection point. Thus the inflection points are precisely the points of order three, satisfying $3A = E$. Finally, if $A$ is an inflection point then $3A = E$ and hence $3(-A) = E$. Therefore $-A$ is an inflection point as well, and the three inflection points, $A$, $E$, $-A$, are collinear.



LECTURE 19 

19.7. (c) Choose a direction, say, $\alpha$, and consider the support lines to the given curve of constant width $\gamma$ having the directions $\alpha$, $\alpha + \pi/3$, $\alpha + 2\pi/3$, $\alpha + \pi$, $\alpha + 4\pi/3$ and $\alpha + 5\pi/3$. These lines form a hexagon with all the angles equal to $2\pi/3$. Let $d$ be the distance between the opposite, parallel sides of the hexagon (equal to the diameter of $\gamma$) and $a, b$ two adjacent sides. Then $d = (a + b)\cos(\pi/6)$. Therefore the sum of any two adjacent sides is the same. Hence the odd-numbered sides are of equal length, and so are the even-numbered sides. Let $c(\alpha)$ be the difference of the former and the latter.

Let $\alpha$ continuously increase by $\pi/3$. Then the sign of $c(\alpha)$ changes to the opposite. Therefore, for some intermediate value of the angle, $c(\alpha) = 0$ and the hexagon is regular.

(d) Let $\gamma$ be a curve of constant width $d$ and $ABCDEF$ be a regular circumscribed hexagon. Let $\gamma_1$ be a Reuleaux triangle inscribed into $ABCDEF$ and touching it at points $B, D, F$. Choose the origin at the center of the hexagon and the horizontal axis parallel to $AB$. Denote the support function, the curvature radius and the area of $\gamma$ by $p(\varphi)$, $\rho(\varphi)$ and $A$ respectively, and let $p_1(\varphi)$, $\rho_1(\varphi)$, $A_1$ be those for $\gamma_1$. Then, by Exercise 10.4, $\rho(\varphi) = p''(\varphi) + p(\varphi)$ and $A = (1/2)\int p(\varphi)\rho(\varphi)\,d\varphi$.

Considering the arc $BD$ of the Reuleaux triangle $\gamma_1$, it is clear that $p(\varphi) \ge p_1(\varphi)$ for $\pi/3 \le \varphi \le 2\pi/3$, and similarly for $\pi \le \varphi \le 4\pi/3$ and $5\pi/3 \le \varphi \le 2\pi$. Considering the centrally symmetric Reuleaux triangle, we conclude that $p(\varphi + \pi) \ge p_1(\varphi + \pi)$ for $\pi/3 \le \varphi \le 2\pi/3$, $\pi \le \varphi \le 4\pi/3$ and $5\pi/3 \le \varphi \le 2\pi$. Therefore

$$2A = \int_0^{2\pi} p(\varphi)\rho(\varphi)\,d\varphi = \left(\int_{\pi/3}^{2\pi/3} + \int_{\pi}^{4\pi/3} + \int_{5\pi/3}^{2\pi}\right)\bigl(p(\varphi)\rho(\varphi) + p(\varphi + \pi)\rho(\varphi + \pi)\bigr)\,d\varphi$$
$$\ge \left(\int_{\pi/3}^{2\pi/3} + \int_{\pi}^{4\pi/3} + \int_{5\pi/3}^{2\pi}\right)\bigl(\rho(\varphi) + \rho(\varphi + \pi)\bigr)p_1(\varphi)\,d\varphi.$$

Since $p(\varphi) + p(\varphi + \pi) = d$ we have $\rho(\varphi) + \rho(\varphi + \pi) = d$, and therefore the last integral equals

$$d\left(\int_{\pi/3}^{2\pi/3} + \int_{\pi}^{4\pi/3} + \int_{5\pi/3}^{2\pi}\right)p_1(\varphi)\,d\varphi = \int_0^{2\pi} p_1(\varphi)\rho_1(\varphi)\,d\varphi = 2A_1,$$

as needed.

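19.9. Let $\gamma(t)$ be the arc length parameterization of the curve. Then $|\gamma(t)| \le 1$, $|\gamma'(t)| = 1$ and $|\gamma''(t)| = |k(t)|$ for all $t$. One has:

$$L = \int_0^L \gamma' \cdot \gamma'\,dt = \gamma \cdot \gamma'\Big|_0^L - \int_0^L \gamma \cdot \gamma''\,dt \le 2 + \int_0^L |\gamma||\gamma''|\,dt \le 2 + \int_0^L |k|\,dt = 2 + C,$$

as claimed.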

19.10. The formulation is the same as in the plane (for a closed curve inside a unit ball, $L \le C$), and the proof given in the solution of Exercise 19.9 applies.

19.11. By approximation of a smooth curve by a polygonal one, it suffices to prove the statement for a single wedge. Let $\alpha$ be the exterior angle of a wedge. Since the measure $dv$ is invariant under isometries of space, $I(\alpha) = \int C_v\,dv$ does not depend on the position of the wedge but only on the value of $\alpha$. It is clear that $I(\alpha)$ is a continuous function of $\alpha$ and that $I(\alpha + \beta) = I(\alpha) + I(\beta)$ (since this additivity holds for each projection). A continuous additive function is linear: $I(\alpha) = C\alpha$. To find the constant $C$ consider the case $\alpha = \pi$. Almost every projection of such a wedge also has curvature $\pi$, and hence $C$ equals the area of the unit sphere, $4\pi$. This implies the result.




Figure 30.18. Proving the triangle inequality in Hilbert's metric 

19.12. Let $ABC$ be a given triangle. Extend its sides to their intersections with the boundary of the convex domain and call the intersection points $P, P_1, Q, Q_1, R, R_1$, see Figure 30.18. We want to show that

$$[P, A, B, Q_1][Q, B, C, R_1] \ge [P_1, A, C, R].$$

Let $X, Y, Z$ be the intersection points of the line $P_1R$ with the lines $PR_1$, $PQ$ and $Q_1R_1$, respectively. The cross-ratio is invariant under a central projection. Projecting the line $PQ_1$ to the line $P_1R$ from the point $R_1$ yields $[P, A, B, Q_1] = [X, A, C, Z]$. Likewise, projecting $QR_1$ to $P_1R$ from $P$ yields $[Q, B, C, R_1] = [Y, A, C, X]$. Since $[X, A, C, Z][Y, A, C, X] = [Y, A, C, Z]$, what we need to show is $[Y, A, C, Z] \ge [P_1, A, C, R]$. But this inequality holds since $Y$ and $Z$ are closer to $A$ and $C$ than $P_1$ and $R$.



LECTURE 20 
20.4. When the loop is pulled down it minimizes its length and becomes a geodesic line on the cone. Cut the cone along the ruling through the point of the loop that is being pulled down and flatten the cone in the plane. One obtains a plane wedge, and the geodesic unfolds to a straight segment. If this wedge has the angle measure less than $\pi$ then the segment lies within the wedge and the loop stays on the cone, but if this angle measure is greater than $\pi$ then the loop slides off the cone.

Consider the plane section of the cone through its axis and let $2\alpha$ be its angle. Cutting the cone along its ruling and flattening it yields a sector with perimeter length $2\pi l\sin\alpha$ where $l$ is the length of a ruling. The borderline case is when this sector is half a circle; then $2\pi l\sin\alpha = \pi l$, and hence $\alpha = \pi/6$.
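
The criterion can be written as a two-line computation. In the sketch below, the angle of the flattened sector is the arc length $2\pi l\sin\alpha$ divided by the radius $l$, and the loop stays on the cone when that angle is less than $\pi$; the function names are illustrative.

```python
import math


def unfolded_angle(alpha):
    """Angle of the plane sector obtained by flattening a cone with axial section angle 2*alpha."""
    return 2 * math.pi * math.sin(alpha)


def loop_stays(alpha):
    """True if the pulled loop stays on the cone, False if it slides off."""
    return unfolded_angle(alpha) < math.pi


for degrees in (15, 29, 31, 45):
    print(degrees, loop_stays(math.radians(degrees)))
```

The answer switches between 29 and 31 degrees, in agreement with the borderline value $\alpha = \pi/6$.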

20.6. (a) The result will follow once we prove that $\gamma_\varepsilon$ is orthogonal to the normals to 
$\gamma$ (the notation is introduced in Exercise 20.6 (c)). 

This claim is obvious if $\gamma$ is a circle - then each $\gamma_\varepsilon$ is a concentric circle. In the general 
case, one approximates $\gamma$ at each point by its osculating circle, and then the concentric 
circles are tangent to $\gamma_\varepsilon$ at the respective points. 

(b) Let $\mathcal{C}(\gamma)$ be the total geodesic curvature of $\gamma$. If one proves that $\mathcal{L}(\gamma^*) = \mathcal{C}(\gamma)$ 
then the result will follow from the Gauss-Bonnet theorem for spherical curves. 

To prove that $\mathcal{L}(\gamma^*) = \mathcal{C}(\gamma)$, approximate the curve by a spherical polygon. So assume 
that $\gamma$ is a convex polygon. Let $C$ be the polyhedral cone whose intersection with the unit 
sphere is $\gamma$ and $C^*$ the dual cone whose intersection with the unit sphere is $\gamma^*$. Then $\mathcal{L}(\gamma^*)$ 
is the sum of the angles between the edges of $C^*$ and $\mathcal{C}(\gamma)$ is the sum of the complements 
to $\pi$ of the dihedral angles of $C$. It remains to use Lemma 20.2. 

(c) Assume that $\gamma$ is a convex spherical n-gon (if $\gamma$ is smooth then the result is 
obtained by approximation). Let $l_1, \ldots, l_n$ be the side lengths, $\alpha_1, \ldots, \alpha_n$ be the angles 
and $\beta_1, \ldots, \beta_n$ the exterior angles of $\gamma$. The domain enclosed by $\gamma_\varepsilon$ consists of n domains 
bounded by the sides of $\gamma$ and the arcs of great circles obtained from these sides by moving 
distance $\pi/2$ in the orthogonal direction, and of n spherical isosceles triangles with equal 
sides $\varepsilon$ and the vertex angles $\pi - \alpha_i = \beta_i$. The perimeter lengths of the outer boundaries 
of the former domains are equal to $l_i \cos\varepsilon$ and of the latter ones to $\beta_i \sin\varepsilon$. Therefore 

$$\mathcal{L}(\gamma_\varepsilon) = \Big(\sum l_i\Big)\cos\varepsilon + \Big(\sum \beta_i\Big)\sin\varepsilon = \mathcal{L}(\gamma)\cos\varepsilon + (2\pi - A(\gamma))\sin\varepsilon$$

(the last equality follows from the Gauss-Bonnet theorem). Since $\gamma_{\varepsilon+\pi/2} = \gamma_\varepsilon^*$, it follows 
from the Gauss-Bonnet theorem that 

$$A(\gamma_\varepsilon) = 2\pi - \mathcal{L}(\gamma_{\varepsilon+\pi/2}) = 2\pi + \mathcal{L}(\gamma)\sin\varepsilon - (2\pi - A(\gamma))\cos\varepsilon.$$

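Alternatively, one argues that 

$$\frac{dA(\gamma_\varepsilon)}{d\varepsilon} = \mathcal{L}(\gamma_\varepsilon), \qquad \frac{d\mathcal{L}(\gamma_\varepsilon)}{d\varepsilon} = 2\pi - A(\gamma_\varepsilon)$$

(the second equality follows from the Gauss-Bonnet theorem), and therefore 

$$\frac{d^2A(\gamma_\varepsilon)}{d\varepsilon^2} = 2\pi - A(\gamma_\varepsilon), \qquad \frac{d^2\mathcal{L}(\gamma_\varepsilon)}{d\varepsilon^2} = -\mathcal{L}(\gamma_\varepsilon).$$

These second order differential equations are easily solved and the initial conditions 
uniquely chosen, which implies the result. 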
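
As a sanity check of the formulas of Part (c), one may take the curve to be a circle of spherical radius r on the unit sphere, so that the curve at distance $\varepsilon$ is the circle of radius $r + \varepsilon$. The sketch below compares the closed-form length and area of that circle with the values predicted by $L\cos\varepsilon + (2\pi - A)\sin\varepsilon$ and $2\pi + L\sin\varepsilon - (2\pi - A)\cos\varepsilon$. The choice of a circle as a test curve and all function names are assumptions made for illustration.

```python
import math

def circle_length(r):
    """Length of a circle of spherical radius r on the unit sphere."""
    return 2 * math.pi * math.sin(r)

def circle_area(r):
    """Area of the spherical cap bounded by that circle."""
    return 2 * math.pi * (1 - math.cos(r))

def predicted_length(L, A, eps):
    """Length of the equidistant curve according to Part (c)."""
    return L * math.cos(eps) + (2 * math.pi - A) * math.sin(eps)

def predicted_area(L, A, eps):
    """Area enclosed by the equidistant curve according to Part (c)."""
    return 2 * math.pi + L * math.sin(eps) - (2 * math.pi - A) * math.cos(eps)

r, eps = 0.4, 0.3
L, A = circle_length(r), circle_area(r)
length_error = abs(predicted_length(L, A, eps) - circle_length(r + eps))
area_error = abs(predicted_area(L, A, eps) - circle_area(r + eps))
```

Both errors vanish up to rounding, which is just the addition formulas for $\sin(r + \varepsilon)$ and $\cos(r + \varepsilon)$.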

(e) Again we assume that $\gamma$ is a convex spherical n-gon. Let $\alpha_1, \ldots, \alpha_n$ be its exterior 
angles. Then the total curvature is $\mathcal{C}(\gamma) = \sum \alpha_i$. The domain inside $\gamma'$ is the union of 
the interior of $\gamma$ and n spherical isosceles triangles with equal sides $\pi/2$ and the vertex 
angles $\alpha_i$. The area of the $i$-th triangle is $\alpha_i$, therefore the area inside $\gamma'$ equals 
$\sum \alpha_i + A(\gamma) = \mathcal{C}(\gamma) + A(\gamma) = 2\pi$ by the Gauss-Bonnet theorem. 

20.9, 20.10. See [31]. 

LECTURE 21 

21.2. Let $(x, y, z)$ be a point of the unit sphere $x^2 + y^2 + z^2 = 1$ and $(X, Y)$ be its 
stereographic projection on the plane. The explicit formula is easy to derive from similar 
triangles: 

$$x = \frac{2X}{R^2 + 1}, \qquad y = \frac{2Y}{R^2 + 1}, \qquad z = \frac{R^2 - 1}{R^2 + 1},$$

where $R^2 = X^2 + Y^2$. Consider a plane circle $(X + a)^2 + (Y + b)^2 = c^2$. This equation 
can be rewritten as $R^2 + 2aX + 2bY = d$ with $d = c^2 - a^2 - b^2$. Further rewrite it as 

$$\frac{2aX}{R^2 + 1} + \frac{2bY}{R^2 + 1} + \frac{(d + 1)(R^2 - 1)}{2(R^2 + 1)} = \frac{d - 1}{2},$$

that is, as $2ax + 2by + (d + 1)z = d - 1$. This is an equation of a plane (not through the 
North Pole (0, 0, 1)); and the intersection of this plane with the sphere is a circle. Hence 
the preimage of a plane circle under the stereographic projection is a spherical circle. By 
continuity, one can conclude that the preimage of a straight line is a circle on the sphere 
that passes through the North Pole. 
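
The plane equation can be tested numerically. The sketch below samples points of a plane circle $(X + a)^2 + (Y + b)^2 = c^2$, lifts them to the sphere by $x = 2X/(R^2+1)$, $y = 2Y/(R^2+1)$, $z = (R^2-1)/(R^2+1)$, and evaluates $2ax + 2by + (d+1)z - (d-1)$. The sample values of a, b, c and the function names are arbitrary illustrative choices.

```python
import math

def inverse_stereographic(X, Y):
    """Point (x, y, z) of the unit sphere that projects to the plane point (X, Y)."""
    R2 = X * X + Y * Y
    return (2 * X / (R2 + 1), 2 * Y / (R2 + 1), (R2 - 1) / (R2 + 1))

def plane_residual(a, b, c, t):
    """2ax + 2by + (d+1)z - (d-1) at the lift of a point of the plane circle."""
    d = c * c - a * a - b * b
    X = -a + c * math.cos(t)   # point of the circle (X+a)^2 + (Y+b)^2 = c^2
    Y = -b + c * math.sin(t)
    x, y, z = inverse_stereographic(X, Y)
    return 2 * a * x + 2 * b * y + (d + 1) * z - (d - 1)

samples = [2 * math.pi * k / 50 for k in range(50)]
max_residual = max(abs(plane_residual(0.7, -1.3, 2.1, t)) for t in samples)
```

The residual is zero up to rounding at every sample, so the lifted points lie in one plane.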

To prove that the stereographic projection preserves the angles between circles, it 
suffices to assume that one of the circles, say, $C_1$, is a meridian. Let $C_2$ be another circle 
on the sphere, and let X be an intersection point of $C_1$ with $C_2$. Replace $C_2$ by the circle 
$C_3$ that passes through X, is tangent to $C_2$ at this point and passes through the North 
Pole. Let $C_0$ be the meridian tangent to $C_3$ at the North Pole. Let $L_0, L_1$ and $L_3$ be the 
images of $C_0, C_1$ and $C_3$ under the stereographic projection; these are lines in the plane, and 
$L_0$ is parallel to $L_3$. It suffices to prove that the angle between $C_1$ and $C_3$ equals the angle 
between $L_1$ and $L_3$. Indeed, the angle between $L_1$ and $L_3$ equals that between $L_1$ and $L_0$ 
(property of parallel lines); the latter equals the angle between $C_1$ and $C_0$ (obvious); 
and this angle equals the angle between $C_1$ and $C_3$ (two angles made by two circles are 
equal). 

21.3. The idea is similar (and, in a sense, dual) to the proof of Theorem 21.1. 
Assume that P is circumscribed. Consider a face $A_1, A_2, \ldots, A_n$ and let O be its 
tangency point with the sphere. Clearly, the sum of angles $A_iOA_{i+1}$ is $2\pi$. As before, we 
shall sum up all these angles over all faces, taking the ones in the white faces with the 
positive signs, and the ones in the black faces with the negative signs. Since there are 
more black faces, this sum, $\Sigma$, is negative. 

On the other hand, consider two adjacent faces with a common edge AB, see Figure 
30.19. The angles AOB and AO'B are equal. Indeed, revolve the plane AOB about the 
line AB (as if it were a hinge) until it coincides with the plane AO'B. This rotation takes 
point O to O', see Figure 30.19, and hence the triangles AOB and AO'B are congruent. 

Figure 30.19. Solution to Exercise 21.3 

There are two kinds of adjacent faces: black-white and white-white; the former's 
contribution to $\Sigma$ cancels, and the latter contribute a positive number. Thus $\Sigma > 0$, a 
contradiction. 

21.4. Let k be the number of faces adjacent to every vertex, e the number of edges and 
b and w the number of black and white vertices. Let us count, in two ways, the number 
of edges. On the one hand, there are k edges adjacent to every black vertex, so e = bk. 
For the same reason, e = wk, and hence b = w. 

LECTURE 22 

22.5. Let P be our polyhedron. Let Dehn(P) $\ne 0$; more specifically, let the coefficient 
of $a_i \otimes \alpha_j$ in Dehn(P) be $c \ne 0$ (see the definition of Dehn(P) in Section 22.4). Let V 
be the volume of the polyhedron P and d be the diameter of P. Let, further, C be the 
sum of the absolute values of the coefficients of $a_i \otimes \alpha_j$ contributed by all the edges of P in 
Dehn(P). Consider a tiling of space by polyhedra congruent to P. Fix in space some ball 
B of radius R, and let Q be the union of tiles whose intersection with B is not empty. Let 
N be the number of the tiles in Q. Since $Q \supset B$, 

$$N = \frac{\mathrm{Volume}(Q)}{\mathrm{Volume}(P)} \ge \frac{4\pi R^3}{3V}.$$

The absolute value of the coefficient of $a_i \otimes \alpha_j$ in Dehn(Q) is $N|c|$. On the other hand, 
the edges of Q all belong to the copies of P which are contained in the domain between 
two concentric spheres of radii R + d and R - d. (See Figure 30.20.) 

Figure 30.20. Exercise 22.5: the outer solid contour symbolizes 
the polyhedron Q, the tiles between the two solid contours 
contribute into the Dehn invariant of Q 

The volume of this domain is 

$$\frac{4}{3}\pi\big((R + d)^3 - (R - d)^3\big) = \frac{4}{3}\pi(6R^2d + 2d^3) = \frac{8}{3}\pi d(3R^2 + d^2).$$

Hence, the number of tiles within this domain does not exceed $\frac{8\pi d(3R^2 + d^2)}{3V}$, and the 
absolute value of contribution of edges of these copies of P cannot exceed 
$\frac{8\pi d(3R^2 + d^2)}{3V}\,C$. Thus, 

$$\frac{4\pi R^3}{3V}\,|c| \le \frac{8\pi d(3R^2 + d^2)}{3V}\,C, \qquad |c| \le \frac{2d(3R^2 + d^2)}{R^3}\,C,$$

and since R may be chosen arbitrarily large, this means that $|c|$ is less than any positive 
number. This contradicts the positivity of $|c|$. See [48] and the references therein. 

LECTURE 23 

23.1. There are as many black as white squares here. However, the left half of the 
figure contains more black squares and the right half more white ones (16 against 9, in 
each half). Suppose that there exists a tiling. Then at most one tile intersects the wavy 
line in the middle in Figure 23.24, and the left part of the polygon (except, possibly, one 
right-most square, adjacent to the wavy line) is tiled by dominos. But this is impossible, 
due to the lack of balance between the black and white squares. 

23.4. Argue as in the proof of Theorem 23.1. That a tiling exists for $n \equiv 
0, 2, 9,$ or $11 \bmod 12$ is shown by explicit constructions for $n \le 12$ and then extending 
from a tiling for $n = 12k$ to $n = 12k + l$ where $l = 2, 9, 11,$ or $12$. 

To prove that no tiling exists for other values of n, note first that a necessary condition 
for a tiling to exist is that the number of dots is divisible by 3: $n(n + 1)/2 \equiv 0 \bmod 3$, 
and hence $n \equiv 0$ or $2 \bmod 3$. Thus we need to consider $n \equiv 3, 5, 6,$ or $8 \bmod 12$. The 
boundary words of the tiles are $x^2yx^{-1}yx^{-1}y^{-2}$ and $xy^2x^{-2}y^{-1}xy^{-1}$. Their shadows on 
the hexagonal grid are closed. Assign to an oriented closed path on the hexagonal grid the 
sum of its rotation numbers about the hexagonal regions. These numbers are equal to $\pm 1$ 
for the boundary paths of the tiles and to $[(n + 1)/3]$ for the boundary path of the region. It 
follows that if the region is tiled by m tiles then $[(n + 1)/3] \equiv m \bmod 2$. On the other hand, 
$n(n + 1)/2 = 3m$, hence $n(n + 1)/2 \equiv m \bmod 2$. Therefore $[(n + 1)/3] \equiv n(n + 1)/2 \bmod 2$, 
and it is easy to see that this congruence does not hold for $n \equiv 3, 5, 6,$ or $8 \bmod 12$. 
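
The final congruence check is a finite computation, since both sides depend only on n modulo 12. The sketch below lists the residues modulo 12 that pass both necessary conditions (divisibility of the number of dots by 3, and the parity congruence); the function names are illustrative.

```python
def dots_divisible_by_three(n):
    """Necessary condition: n(n+1)/2 dots must split into tiles of 3 dots."""
    return (n * (n + 1) // 2) % 3 == 0

def parity_condition_holds(n):
    """The congruence [(n+1)/3] = n(n+1)/2 (mod 2), with [.] the integer part."""
    return ((n + 1) // 3 - n * (n + 1) // 2) % 2 == 0

# Collect the residues mod 12 of all n in a range that pass both tests.
admissible = sorted({n % 12 for n in range(1, 241)
                     if dots_divisible_by_three(n) and parity_condition_holds(n)})
```

The surviving residues are exactly 0, 2, 9 and 11, in agreement with the statement.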

23.5. Argue as in the proof of Theorem 23.3. Define an additive function $f(x)$ such 
that $f(1) = 1$, $f(\sqrt{2}) = -0.5$ and $f(x) = 0$ if x is not a rational combination of 1 and $\sqrt{2}$. 
Define the "area" of a $u \times v$ rectangle as $f(u)f(v)$. Then the "area" of the polygon in 
Figure 23.25 equals $-0.75$. If a polygon is tiled by squares then its "area" is non-negative, 
whence the result. 

LECTURE 24 

24.3. The following argument resembles the proof of Theorem 10.3. Let $\gamma(\varphi)$ and 
$\gamma_1(\varphi)$ be the two ovals, parameterized by the directions of their tangent lines. Then 
$\gamma_1'(\varphi) = h(\varphi)\gamma'(\varphi)$ where the function h is the ratio $ds_1/ds$. We want to show that h has 
at least four extrema. 

Let $f_1(\varphi)$ and $f_2(\varphi)$ be the components of the vector-valued function $\gamma(\varphi)$. Since $\gamma_1$ 
is a closed curve, 

$$\int \gamma_1'(\varphi)\,d\varphi = \int h(\varphi)\gamma'(\varphi)\,d\varphi = 0,$$

hence $\int h f_1'\,d\varphi = \int h f_2'\,d\varphi = 0$. It follows, by integration by parts, that $\int h' f_1\,d\varphi = 
\int h' f_2\,d\varphi = 0$. Clearly, $\int h'\,d\varphi = 0$ as well. 

Assume that $h'$ has only 2 sign changes. We can find a combination of the functions 
$f_1, f_2$ and 1, say, $af_1 + bf_2 + c$, that changes sign at the same points as $h'$. The function 
$af_1 + bf_2 + c$ has no other zeros - otherwise the line $ax + by + c = 0$ would intersect the 
oval $\gamma$ at more than 2 points. Therefore $af_1 + bf_2 + c$ has the same intervals of constant 
sign as $h'$ and $\int h'(af_1 + bf_2 + c)\,d\varphi \ne 0$. This contradicts the previous paragraph. 

24.4. This is a discrete analog of the previous problem. Denote the position vectors of 
the vertices of P by $V_1, \ldots, V_n$ and let $\ell_i = |V_{i+1} - V_i|$. Set $h_i = \ell_i'/\ell_i$. Since P' is a closed 
polygon, $\sum h_i(V_{i+1} - V_i) = 0$. It follows that $\sum (h_{i+1} - h_i)V_i = 0$ (discrete integration by 
parts). Set $g_i = h_{i+1} - h_i$. We claim that the cyclic sequence $g_i$ is either identically zero 
or it changes sign at least four times. Since $\sum g_i = 0$, there are at least two sign changes 
or $g_i = 0$ for all i. Assume that there are exactly two sign changes. Then there exists a 
line m such that for the vertices of P on one side of m one has $g_i \ge 0$ and on the other 
side $g_i \le 0$ and there is at least one positive and one negative value. Choose the origin 
on the line m. Then the point $\sum g_iV_i$ lies on the side of m where $g_i > 0$, and hence the 
vector $\sum g_iV_i$ is not zero. This is a contradiction. 

Finally, if $g_{i-1} < 0 < g_i$ then $h_{i+1} > h_i < h_{i-1}$, and hence $\ell_{i+1}'/\ell_{i+1} > \ell_i'/\ell_i < 
\ell_{i-1}'/\ell_{i-1}$. Therefore $a_i > 0$, and likewise for $g_{i-1} > 0 > g_i$. 

24.5. See [39], Theorem 2.19. 



LECTURE 25 

25.7. (a) Consider an edge E of $P_t$ of length $l(t)$ with the dihedral angle $\varphi(t)$. Let 
$n(t)$ and $n_1(t)$ be the unit outer normal vectors to the faces F and $F_1$ that share E, and 
let $w(t)$ and $w_1(t)$ be the inner normal vectors to E in F and $F_1$ having magnitude $l(t)$. 
We claim that 

(30.5)    $-l(t)\varphi'(t) = n'(t) \cdot w(t) + n_1'(t) \cdot w_1(t)$ 

where prime is d/dt. Indeed, choose a Cartesian coordinate system in the plane orthogonal 
to E, and let $\theta$ and $\theta_1$ be the angles made by $n(t)$ and $n_1(t)$ with the x-axis. Then the 
vectors involved are 

$$n = (\cos\theta, \sin\theta), \quad n_1 = (\cos\theta_1, \sin\theta_1), \quad w = l(\sin\theta, -\cos\theta), \quad w_1 = l(-\sin\theta_1, \cos\theta_1).$$

Therefore 

$$n' = \theta'(-\sin\theta, \cos\theta) = -\frac{1}{l}\,\theta' w, \qquad n_1' = \theta_1'(-\sin\theta_1, \cos\theta_1) = \frac{1}{l}\,\theta_1' w_1,$$

and hence $n' \cdot w + n_1' \cdot w_1 = l(\theta_1' - \theta')$. It remains to notice that $\theta_1(t) - \theta(t) = \pi - \varphi(t)$. 

For every face F of P, the sum of the inner normal vectors to the edges inside this 
face, whose magnitudes are equal to the lengths of these edges, is zero (compare with 
Exercise 30.8). Dot multiply this sum by the derivative of the outer unit normal vector to F 
and sum over all faces. The resulting sum is zero and it equals the sum, over all edges, of 
the right hand sides of (30.5). The result follows. 



LECTURE 26 

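26.2. (b) The interval $(\frac{1}{3}, \frac{2}{3})$ consists of numbers $[0.1\ldots]_3$ besides $[0.1]_3$. The 
intervals $(\frac{1}{9}, \frac{2}{9})$ and $(\frac{7}{9}, \frac{8}{9})$ consist of numbers $[0.01\ldots]_3$ and $[0.21\ldots]_3$ besides $[0.01]_3$ 
and $[0.21]_3$ and so on. Thus, C consists of all $[0.d_1d_2d_3\ldots]_3$ with $d_i \ne 1$ for all i, and the 
numbers $[0.d_1d_2\ldots d_{n-1}1]_3$ with $d_i \ne 1$ for $1 \le i < n$. 

(a) 

$$\frac{10}{13} = 20 \cdot \frac{1}{26} = [202]_3\Big(\frac{1}{27} + \frac{1}{27^2} + \ldots\Big) = [0.202202202\ldots]_3,$$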

I - bh = ^ 

These numbers belong to C by the result in Part (b). 

26.3. (a) The definition of $\gamma$ shows that 

$$\gamma[0.1\ldots]_3 = [0.1]_2, \qquad \gamma[0.001\ldots]_3 = [0.001]_2,$$
$$\gamma[0.01\ldots]_3 = [0.01]_2, \qquad \gamma[0.021\ldots]_3 = [0.011]_2,$$
$$\gamma[0.21\ldots]_3 = [0.11]_2, \qquad \gamma[0.201\ldots]_3 = [0.101]_2,$$
$$\gamma[0.221\ldots]_3 = [0.111]_2.$$

The apparent rule of forming $\gamma(x)$ is the following: if $x = [0.d_1d_2d_3\ldots]_3$ and there are 1's 
among the $d_i$'s, then one should take the first n with $d_n = 1$ and put 

$$\gamma[0.d_1d_2d_3\ldots]_3 = [0.e_1\ldots e_{n-1}1]_2, \qquad e_i = \frac{d_i}{2}.$$

It is clear that to extend $\gamma$ to a continuous function on the whole interval [0, 1], one should 
put 

$$\gamma[0.d_1d_2d_3\ldots]_3 = [0.e_1e_2e_3\ldots]_2, \qquad e_i = \frac{d_i}{2}, \qquad \text{if } d_i \ne 1 \text{ for all } i.$$



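(b) Since $\frac{1}{4} = [0.020202\ldots]_3$, 

$$\gamma\Big(\frac{1}{4}\Big) = [0.010101\ldots]_2 = \frac{1}{4} + \frac{1}{16} + \ldots = \frac{1}{3}.$$

Since $\frac{5}{13} = 10 \cdot \frac{1}{26} = [0.101101101\ldots]_3$, 

$$\gamma\Big(\frac{5}{13}\Big) = [0.1]_2 = \frac{1}{2}.$$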

(c) Solved in (a). 

(d) Obviously follows from (a). 
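
The digit rule of Part (a) translates directly into a short procedure. The following is a minimal sketch, assuming x is a rational number in [0, 1) and truncating the ternary expansion after a fixed number of digits (the parameter `digits` and the function name are illustrative choices, not notation from the text).

```python
from fractions import Fraction

def cantor_function(x, digits=60):
    """Approximate gamma(x) for rational x in [0, 1) from its ternary digits."""
    x = Fraction(x)
    value = Fraction(0)
    weight = Fraction(1, 2)          # weight of the next binary digit
    for _ in range(digits):
        x *= 3
        d = int(x)                   # next ternary digit: 0, 1 or 2
        x -= d
        if d == 1:                   # first digit 1: emit a final binary 1 and stop
            return value + weight
        value += weight * (d // 2)   # ternary 0 -> binary 0, ternary 2 -> binary 1
        weight /= 2
    return value                     # truncated expansion when no digit 1 was met
```

For example, `cantor_function(Fraction(5, 13))` stops at the first ternary digit and returns exactly 1/2, while `cantor_function(Fraction(1, 4))` is a partial sum of 1/4 + 1/16 + ... and differs from 1/3 only by the truncation error.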

26.4. (a) Denote the initial curve F by $F_0$, then put $F_1 = \widetilde{F}_0$, $F_2 = \widetilde{F}_1$ and so on. 
It is clear from the construction that $F_n$ maps every interval $\left[\frac{k}{4^n}, \frac{k+1}{4^n}\right]$ into a square 
of the form $\left[\frac{l_1}{2^n}, \frac{l_1+1}{2^n}\right] \times \left[\frac{l_2}{2^n}, \frac{l_2+1}{2^n}\right]$; moreover it maps $4^n$ different intervals of the 
indicated form into $4^n$ different squares of the indicated form, and maps adjacent intervals 
into adjacent squares. In addition to that, if 

$$F_n\left[\frac{k}{4^n}, \frac{k+1}{4^n}\right] \subset \left[\frac{l_1}{2^n}, \frac{l_1+1}{2^n}\right] \times \left[\frac{l_2}{2^n}, \frac{l_2+1}{2^n}\right],$$

then 

$$F_p\left[\frac{k}{4^n}, \frac{k+1}{4^n}\right] \subset \left[\frac{l_1}{2^n}, \frac{l_1+1}{2^n}\right] \times \left[\frac{l_2}{2^n}, \frac{l_2+1}{2^n}\right]$$

for every $p > n$. This shows that for every $t \in [0, 1]$ and every $p, q > n$, the distance 
between $F_p(t)$ and $F_q(t)$ cannot exceed $\frac{\sqrt{2}}{2^n}$. It remains to use standard theorems from 
Analysis: the "Cauchy criterion" asserts that the sequence $F_n$ converges uniformly to a certain 
map $[0, 1] \to [0, 1]^2$, call it P, and there is another theorem which states that a uniform 
limit of a sequence of continuous functions is continuous, so P is continuous. 

(b) Obviously, $F_n\left(\frac{k}{4^n}\right)$ does not depend on the initial map F and 
$P\left(\frac{k}{4^n}\right) = F_n\left(\frac{k}{4^n}\right)$. 
Thus, the values of P at the fractions with denominators of the form $4^n$ do not depend on 
F, and, since these fractions form a dense subset of [0, 1], P does not depend on F at all. 

(c) It was shown in the solution of Part (a) that the image of P contains points in 
every square $\left[\frac{l_1}{2^n}, \frac{l_1+1}{2^n}\right] \times \left[\frac{l_2}{2^n}, \frac{l_2+1}{2^n}\right]$, for every n. This shows that the image of P 
is dense in $[0, 1]^2$. But the image of [0, 1] with respect to any continuous map is closed; 
hence it must be the whole square $[0, 1]^2$. 

(e) Any $t \in [0, 1]$ may be presented as $[0.A]_2$ where A is an infinite sequence of 
zeroes and ones (it may consist, after some moment, only of zeroes or only of ones). For a 
sequence A, we denote by $\bar{A}$ the "opposite" sequence: zeroes are replaced by ones and ones 
are replaced by zeroes. Suppose that $F[0.A]_2 = ([0.B]_2, [0.C]_2)$. According to Part (b) we 
may assume that F is "symmetric," that is, $F[0.\bar{A}]_2 = ([0.B]_2, [0.\bar{C}]_2)$. The definition of 
$\widetilde{F}$ given before Exercise 26.4 may be restated in the following way: 

$$\widetilde{F}[0.00A]_2 = ([0.0C]_2, [0.0B]_2), \qquad \widetilde{F}[0.01A]_2 = ([0.1B]_2, [0.0C]_2),$$
$$\widetilde{F}[0.10A]_2 = ([0.1B]_2, [0.1C]_2), \qquad \widetilde{F}[0.11A]_2 = ([0.0\bar{C}]_2, [0.1\bar{B}]_2).$$

It is permissible to replace in all these formulas F and $\widetilde{F}$ by P. 
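
The four digit rules can be turned into a procedure for approximating P(t). The sketch below is written under an explicit assumption about how the rules read: reading two binary digits of t at a time, the point (x, y) obtained for the remaining digits is sent to (y/2, x/2) for 00, to (1/2 + x/2, y/2) for 01, to (1/2 + x/2, 1/2 + y/2) for 10, and to ((1 - y)/2, 1/2 + (1 - x)/2) for 11, the last rule being the one with both "opposite" sequences. The function name and the parameter `depth` are illustrative.

```python
from fractions import Fraction

def peano_point(t, depth=40):
    """Approximate P(t) for rational t in [0, 1]; the error is at most 2**-depth."""
    t = Fraction(t)
    quarters = []
    for _ in range(depth):
        t *= 4
        q = min(int(t), 3)       # base-4 digit; t = 1 is read as 0.333...
        t -= q
        quarters.append(q)
    half = Fraction(1, 2)
    x, y = half, half            # arbitrary starting point inside the square
    for q in reversed(quarters): # apply the rules from the innermost digit outwards
        if q == 0:
            x, y = y / 2, x / 2
        elif q == 1:
            x, y = half + x / 2, y / 2
        elif q == 2:
            x, y = half + x / 2, half + y / 2
        else:
            x, y = (1 - y) / 2, half + (1 - x) / 2
    return x, y
```

Under these assumptions the curve starts near (0, 0), ends near (0, 1), and `peano_point(Fraction(1, 3))` lies within $2^{-40}$ of the corner (1, 0).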

(f) The formulas above show that for any 2n-term sequence a of digits 0 and 1, 
$F_n[0.aA]_2 = ([0.bB']_2, [0.cC']_2)$ or $([0.cC']_2, [0.bB']_2)$ where b and c are n-term sequences 
depending only on a, $B' = B$ or $\bar{B}$, $C' = C$ or $\bar{C}$. Repeating this procedure 4 times, 
we obtain the following result: if $P[0.A]_2 = ([0.B]_2, [0.C]_2)$ then for any word a of even 
length, $P[0.aaaaA]_2 = ([0.bB]_2, [0.cC]_2)$ where b and c depend only on a. This implies that 
if A is periodic, then B and C are periodic. (If A is periodic starting from some place, 
then $A = aA_1$ where $A_1$ is (pure) periodic and a has even length. The formulas above 
show that in this case B and C also will be periodic starting from some place.) 

(d) $\frac{1}{3} = [0.010101\ldots]_2$. Let $A = 010101\ldots$ and let $P[0.A]_2 = ([0.B]_2, [0.C]_2)$. Then 

$$P\Big(\frac{1}{3}\Big) = P[0.01A]_2 = ([0.1B]_2, [0.0C]_2) = ([0.B]_2, [0.C]_2).$$

Thus, $B = 1B$, $C = 0C$, hence $B = 11111\ldots$, $C = 00000\ldots$, hence 

$$P\Big(\frac{1}{3}\Big) = (1, 0).$$

$\frac{1}{5} = [0.001100110011\ldots]_2$. Let $A = 001100110011\ldots$ and let $P[0.A]_2 = ([0.B]_2, [0.C]_2)$. 
Then 

$$P[0.11A]_2 = ([0.0\bar{C}]_2, [0.1\bar{B}]_2), \qquad P[0.0011A]_2 = ([0.01\bar{B}]_2, [0.00\bar{C}]_2),$$

and hence 

$$P[0.00110011A]_2 = ([0.0110B]_2, [0.0011C]_2).$$

Thus, $B = 0110B$, $C = 0011C$, hence $B = 011001100110\ldots$, $C = 001100110011\ldots$ and 

$$P\Big(\frac{1}{5}\Big) = ([0.B]_2, [0.C]_2) = \Big(\frac{2}{5}, \frac{1}{5}\Big).$$

(The reader who wants to perform a more challenging computation may try to compute 
$P\left(\frac{1}{7}\right)$. According to our computations, it is $\left(\frac{29}{65}, \frac{28}{65}\right)$.) 
\7 J \65 65/ 

LECTURE 27 

27.1. (b) Answer: $-(x^2y - x - 1)^2 - (x^2 - 1)^2$. 

27.2. (b) Answer: $x^2(1 + y)^3 + y^2$. 

27.3. See [27]. 

LECTURE 28 

28.8. See Figure 30.21: the focus of the parabola is also a focus of the ellipse; the 
two conics are smoothly connected by arbitrary curves. The construction makes use of 
Exercise 28.4. 




Figure 30.21. Trap for a parallel beam 

28.9. See [59, 88]. 
LECTURE 29 

29.5. Let us compare the areas of two triples of circles inside an equilateral triangle 
with unit sides. The first configuration consists of three equal circles, each inscribed in its 
own angle of the triangle and tangent to two other circles. Let r be the radius of such a 
circle. Then $2r + 2\sqrt{3}\,r = 1$ and hence $r = 1/(2(1 + \sqrt{3}))$. The area of the three circles is 
$A_1 = 3\pi/(4(1 + \sqrt{3})^2)$, and $A_1/\pi \approx 0.1005$. 

The second configuration consists of the circle inscribed into the triangle and two 
equal smaller circles, inscribed into two angles of the triangle and each tangent to the 
larger circle (but not to each other). Let R be the radius of the larger circle and r that 
of the smaller ones. Then $R = \sqrt{3}/6$. The other radius is found from the equation 
$(0.5 - \sqrt{3}\,r)^2 + (R - r)^2 = (R + r)^2$. This quadratic equation has two roots, $\sqrt{3}/2$ and 
$\sqrt{3}/18$, and r is the latter. The total area of the three circles is $A_2 = \pi/12 + \pi/54$, and 
$A_2/\pi \approx 0.1018$. Thus the second configuration has a greater area. 
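
The two areas are easy to recompute. The sketch below solves the quadratic for the smaller radius after expanding $(0.5 - \sqrt{3}\,r)^2 + (R - r)^2 = (R + r)^2$ to $3r^2 - (\sqrt{3} + 4R)r + 0.25 = 0$; the variable names are illustrative.

```python
import math

sqrt3 = math.sqrt(3)

# First configuration: three equal circles, 2r + 2*sqrt(3)*r = 1.
r_equal = 1 / (2 * (1 + sqrt3))
area1_over_pi = 3 * r_equal ** 2

# Second configuration: the inscribed circle plus two smaller corner circles.
R = sqrt3 / 6
# Expanded equation: 3 r^2 - (sqrt(3) + 4R) r + 0.25 = 0.
b = sqrt3 + 4 * R
disc = math.sqrt(b * b - 3)              # discriminant b^2 - 4 * 3 * 0.25
roots = sorted([(b - disc) / 6, (b + disc) / 6])
r_small = roots[0]                       # the larger root does not fit in the triangle
area2_over_pi = R ** 2 + 2 * r_small ** 2
```

The roots come out as $\sqrt{3}/18$ and $\sqrt{3}/2$, and the two ratios are approximately 0.1005 and 0.1018, so the second configuration indeed covers more area.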

Another competing configuration consists of three circles that make a chain and are 
inscribed into one angle of the triangle. If the triangle is isosceles and very long and thin 
then the total area of such a chain of circles is almost twice as large as the total area of 
the Malfatti configuration of three circles, each tangent to two others. 

LECTURE 30 
30.1. Answer. 

(X - Xl)(x - X2) ■ ■ ■ (X - X k -l)(x - Xk+l) .-.{x — x n +i) 



k= 



I (Xk - Xl)(Xk - Xl) . . . (Xk - X k -l)(x k - Xk+l) ■ ■ ■ (Xk - X„+l) ' 



The polynomial is unique. 
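
For illustration, here is a minimal sketch of evaluating such an interpolating polynomial in exact arithmetic. It assumes the prescribed values at the nodes are supplied as a list called `values`; that name, like the function name, is hypothetical and not taken from the exercise.

```python
from fractions import Fraction

def lagrange_value(nodes, values, x):
    """Value at x of the polynomial of degree <= n through (nodes[k], values[k])."""
    total = Fraction(0)
    for k, xk in enumerate(nodes):
        term = Fraction(values[k])
        for j, xj in enumerate(nodes):
            if j != k:
                # Factor (x - x_j) / (x_k - x_j): equals 1 at x_k and 0 at x_j.
                term *= Fraction(x - xj, xk - xj)
        total += term
    return total
```

With nodes 0, 1, 2 and values 1, 3, 7 (taken from $x^2 + x + 1$), the call `lagrange_value([0, 1, 2], [1, 3, 7], 3)` reproduces the value 13 of that quadratic at 3, as uniqueness requires.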

30.2. Consider the polynomial g(x) from the proof of Theorem 30.2. Since g(x) = 1, 
its coefficients of degree 0, 1, ... ,n — 2 vanish. This gives all but one desired identities. 
The last one is obtained by equating the leading coefficient to 1. 

30.3. The claim will follow once we prove that 

$$\frac{h(q_1)}{f'(q_1)} + \frac{h(q_2)}{f'(q_2)} + \cdots + \frac{h(q_n)}{f'(q_n)} = 0,$$

where the sum is taken over all roots of a hyperbolic polynomial $f(x)$ of degree n and $h(x)$ 
is a polynomial of degree $n - 2$ or less. Since a polynomial $h(x)$ is a linear combination of 
the monomials $1, x, \ldots, x^{n-2}$, the result follows from Exercise 30.2. 
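
The vanishing of these sums can be observed on a concrete example. The sketch below takes a monic polynomial with distinct integer roots (an arbitrary illustrative choice) and computes $\sum q^k / f'(q)$ over its roots in exact arithmetic for $k = 0, \ldots, n - 1$.

```python
from fractions import Fraction

def derivative_at_root(roots, q):
    """f'(q) for the monic f(x) = prod (x - r), evaluated at one of its roots q."""
    value = Fraction(1)
    for r in roots:
        if r != q:
            value *= (q - r)
    return value

def power_sum(roots, k):
    """Sum of q**k / f'(q) over all roots q of f."""
    return sum(Fraction(q) ** k / derivative_at_root(roots, q) for q in roots)

roots = [-3, -1, 2, 5, 7]                       # n = 5 distinct real roots
sums = [power_sum(roots, k) for k in range(len(roots))]
```

The sums for $k = 0, \ldots, n - 2$ are all zero, while the sum for $k = n - 1$ equals 1 for a monic polynomial.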

30.7. Recall the notion of elliptic coordinates introduced in Lecture 28. Given a 
confocal family of conics 

$$\frac{x^2}{a^2 + \lambda} + \frac{y^2}{b^2 + \lambda} = 1,$$

the elliptic coordinates of a point (x, y) are the two values of $\lambda$ for which this equation 
holds. Fixing one of the elliptic coordinates describes an ellipse, fixing the other - a 
confocal hyperbola. It is easy to calculate that if $(\lambda, \mu)$ are the elliptic coordinates of a 
point (x, y) then 

$$x^2 = \frac{(a^2 + \lambda)(a^2 + \mu)}{a^2 - b^2}, \qquad y^2 = \frac{(b^2 + \lambda)(b^2 + \mu)}{b^2 - a^2}.$$

Let P = (x, y) be a point of the ellipse 

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$

and Q = (X, Y) = A(x, y) be the respective point of the ellipse 

$$\frac{x^2}{a^2 + \lambda} + \frac{y^2}{b^2 + \lambda} = 1,$$

where $X = x\sqrt{a^2 + \lambda}/a$, $Y = y\sqrt{b^2 + \lambda}/b$. Let the elliptic coordinates of P be $(0, \mu)$; let 
those of Q be $(\lambda, \eta)$. We want to prove that $\mu = \eta$. Indeed, expressing the Cartesian 
coordinates in terms of the elliptic ones, 

$$X^2 = \frac{a^2 + \lambda}{a^2} \cdot \frac{a^2(a^2 + \mu)}{a^2 - b^2} = \frac{(a^2 + \lambda)(a^2 + \eta)}{a^2 - b^2}.$$

Hence $\mu = \eta$, as needed. 
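
The equality of the second elliptic coordinates can also be checked numerically. The sketch below finds both elliptic coordinates of a point as the roots of the quadratic obtained by clearing denominators in $x^2/(a^2 + t) + y^2/(b^2 + t) = 1$; the sample values of a, b and the confocal parameter, and the function name, are illustrative assumptions.

```python
import math

def elliptic_coordinates(x, y, a, b):
    """Both roots t of x^2/(a^2+t) + y^2/(b^2+t) = 1, smaller root first."""
    # Clearing denominators gives t^2 + p*t + q = 0 with:
    p = a * a + b * b - x * x - y * y
    q = a * a * b * b - x * x * b * b - y * y * a * a
    disc = math.sqrt(p * p - 4 * q)
    return ((-p - disc) / 2, (-p + disc) / 2)

a, b, lam = 3.0, 2.0, 1.5
x, y = a * math.cos(0.7), b * math.sin(0.7)   # P on the ellipse with parameter 0
X = x * math.sqrt(a * a + lam) / a            # Q = A(P) on the confocal ellipse
Y = y * math.sqrt(b * b + lam) / b
coords_P = elliptic_coordinates(x, y, a, b)
coords_Q = elliptic_coordinates(X, Y, a, b)
```

The larger coordinate is 0 for P and `lam` for Q, while the smaller coordinates of P and Q coincide, which is the claimed equality.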